Abstract

In this paper we address the statistical problem of testing whether a stationary process is Gaussian.
The observation consists of a finite sample path of the process. Using a random projection technique
introduced and studied in [7] in the framework of goodness-of-fit tests for functional data, we build
some decision rules. These rules rely on the whole distribution of the process and not only on
its marginal distribution at a fixed order. The main idea is to test the Gaussianity of the marginal
distribution of some random linear combinations of the process. This leads to consistent decision
rules. Some numerical simulations show the pertinence of our approach.

1 Introduction

In many concrete situations the statistician observes a finite path $X_1, \ldots, X_n$ of a real
temporal phenomenon. A common modeling is to assume that the observation is a finite
path of a second order weakly stationary process $X := (X_t)_{t \in \mathbb{Z}}$ (see, for example, [15]).
This means that the random variable (r.v.) $X_t$ is, for any $t \in \mathbb{Z}$, square integrable and
that the mean and the covariance structure of the process are invariant under any translation

* Those authors have been partially supported by the Spanish Ministerio de Ciencia y Tecnología, grant MTM2005-08519-C02-02 and the Consejería de Educación y Cultura de la Junta de Castilla y León, grant PAPIJCL VA102/06.
on the time index. That is, for any $t, s \in \mathbb{Z}$, $E(X_t)$ does not depend on $t$ and $E(X_t X_s)$
only depends on the distance between $t$ and $s$. A more popular framework is the Gaussian
case, where the additional assumption of Gaussianity of all finite-dimensional marginal distributions of
the process $(X_t)_{t \in \mathbb{Z}}$ is added. In this case, as the multidimensional Gaussian distribution
only depends on moments of order one and two, the process is also strongly stationary.
This means that the law of all finite-dimensional marginal distributions is invariant if
the time is shifted:

$$(X_1, \ldots, X_n) \stackrel{\mathcal{L}}{=} (X_{t+1}, \ldots, X_{t+n}), \quad (t \in \mathbb{Z},\ n \in \mathbb{N}).$$

Gaussian stationary processes are very popular because they share plenty of very nice
properties concerning their statistics or prediction (see, for example, [3] or [29]). Hence, an
important topic in the field of stationary processes is the implementation of a statistical
procedure that allows one to assess Gaussianity. In the last three decades, many works have
been developed to build such methods. For example, in [11] a test based on the analysis
of the empirical characteristic function is proposed. The test in [21] is based on the skewness and
kurtosis, and is also called the Jarque-Bera test. The one in [24] is based on both the empirical
characteristic function and the skewness and kurtosis. In [30] we can find another test, this one based
on the bispectral density function. An important drawback of these tests is that they
only consider a finite order marginal of the process (generally the order one marginal!).
Obviously, this provides tests at the right level for the intended problem; but the power of these
tests could be equal to the nominal level against some non-Gaussian alternatives. This is the case,
for example, for a strictly stationary non-Gaussian process having a one-dimensional Gaussian
marginal.
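
To make this drawback concrete, the following minimal Python sketch simulates one such alternative. The construction is an illustrative assumption and is not taken from the references: $X_t = s_t |Z_t|$, where the $Z_t$ are i.i.d. standard Gaussian and $(s_t)$ is an independent stationary Markov chain of signs that flips with probability 0.1 (the function names and the flip probability are hypothetical). Each $X_t$ is exactly $N(0,1)$, yet $X_t + X_{t+1}$ is not Gaussian: its theoretical kurtosis is about 2.2 instead of 3.

```python
import random

def simulate_sign_chain_process(n, flip_prob=0.1, seed=0):
    """Simulate X_t = s_t * |Z_t| (hypothetical example, not from the paper)."""
    rng = random.Random(seed)
    sign = rng.choice((-1.0, 1.0))           # uniform start keeps the chain stationary
    path = []
    for _ in range(n):
        if rng.random() < flip_prob:         # Markov chain of signs
            sign = -sign
        path.append(sign * abs(rng.gauss(0.0, 1.0)))
    return path

def kurtosis(sample):
    """Sample kurtosis m4 / m2^2 (the Gaussian value is 3)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    return m4 / m2 ** 2

x = simulate_sign_chain_process(50_000)
pair_sums = [x[t] + x[t + 1] for t in range(len(x) - 1)]
print("kurtosis of X_t:          ", round(kurtosis(x), 2))
print("kurtosis of X_t + X_{t+1}:", round(kurtosis(pair_sums), 2))
```

A test that only looks at the one-dimensional marginal sees the first quantity, which is compatible with Gaussianity; the departure only shows up in a linear combination of several coordinates.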

In this paper, we propose a procedure to assess whether a strictly stationary process
is Gaussian. Our test is consistent against every strictly stationary alternative satisfying
regularity assumptions. The procedure is a combination of the random projection method
(see [7] and [8]) and classical methods that allow one to assess whether the one-dimensional
marginal of a stationary process is Gaussian (see the previous discussion).

Regarding the random projection method, we follow the same methodology as the
one proposed in [8]. Roughly speaking, it is shown therein that a single random projection
suffices to characterize a probability distribution. In particular, we employ the results of [7],
where the main result of [8] is generalized to obtain goodness-of-fit tests for families of
distributions, and in particular for Gaussian families.

Therefore, given a strictly stationary process $(X_t)_{t \in \mathbb{Z}}$, we are interested in constructing
a test for the null hypothesis $H_0$: $(X_t)_{t \in \mathbb{Z}}$ is Gaussian. Notice that $H_0$ holds if, and only
if, $(X_t)_{t \le 0}$ is Gaussian. Hence, using the random projection method [7], this is, roughly
speaking, equivalent to a (one-dimensional) randomly chosen projection of $(X_t)_{t \le 0}$
being Gaussian. This idea allows us to translate the problem into another one, which consists of
checking whether the one-dimensional marginal of a random transformation of $(X_t)_{t \in \mathbb{Z}}$ is
Gaussian. This can be tested using a usual procedure. Here, we will employ the well-known
Epps test [11] and the Lobato and Velasco skewness-kurtosis test [21]. We also use
a combination of them as a way to alleviate some problems that those tests present.
Furthermore, the Epps test checks whether the characteristic function of the one-dimensional
marginal of a strictly stationary process coincides with that of a Gaussian distribution.
This check is performed on a fixed finite set of points. As a consequence, it
cannot be consistent against every possible non-Gaussian alternative with non-Gaussian
marginal. However, in our work, the points employed in the Epps test will also be drawn
at random. This will provide the consistency of the whole test. Regarding the Lobato and
Velasco skewness-kurtosis test, we will prove the consistency of the test under different
hypotheses than those in [21].

The paper is organized as follows. In the next section we give some basic definitions
and notations. In Section 3 we discuss some useful known results. They concern
the random projection method, some Gaussianity tests for strictly stationary processes
and a procedure for multiple testing. The section also contains a new result characterizing
Gaussian distributions. In Section 4 we introduce our procedure and analyze its asymptotic
behavior. Section 5 contains some details on the practical application of the method
and Section 6 includes the results of the simulations. The paper ends with a discussion.
Throughout the paper all the processes are assumed to be integrable.

2 Notations and basic definitions 

If $Y$ is a random variable, we denote by $\Phi_Y$ its characteristic function; $\Phi_{\mu,\gamma}$ denotes the
characteristic function of the Gaussian distribution with mean $\mu \in \mathbb{R}$ and variance $\gamma > 0$.

$\mathbb{H}$ denotes a separable Hilbert space with inner product $\langle \cdot, \cdot \rangle$ and norm $\| \cdot \|$. $\{v_n\}_{n=1}^{\infty}$
denotes a generic orthonormal basis of $\mathbb{H}$ and $V_n$ the $n$-dimensional subspace spanned by
$\{v_1, \ldots, v_n\}$. For any subspace $V \subset \mathbb{H}$ we write $V^{\perp}$ for its orthogonal complement. If $D$
is an $\mathbb{H}$-valued random element, then $D_V$ denotes the projection of $D$ on the subspace $V$
of $\mathbb{H}$.

$X$ and $(X_t)_{t \in \mathbb{Z}}$ denote the same process interchangeably. In what follows, when we say that
a process is stationary we mean that it is strictly stationary. Given a stationary process
$X$, let us denote, if they exist, by $\mu_X := E[X_0]$ the mean and by $\mu_{X,k} := E[(X_0 - \mu_X)^k]$, with
$k \in \mathbb{N}$, the centered moment of order $k$. Further, let $\gamma_X(t) := E[(X_0 - \mu_X)(X_t - \mu_X)]$,
with $t \in \mathbb{Z}$, be the autocovariance of order $t$.

Let $X_1, X_2, \ldots, X_n$, $n \in \mathbb{N}$, be a sample of equally spaced observations of the random
process $X$. Let $\hat\mu_X := n^{-1} \sum_{i=1}^{n} X_i$ be the sample mean, $\hat\mu_{X,k} := n^{-1} \sum_{i=1}^{n} (X_i - \hat\mu_X)^k$, for
$k \in \mathbb{N}$, its sample centered moment of order $k$ and

$$\hat\gamma_X(t) := n^{-1} \sum_{i=1}^{n-|t|} (X_i - \hat\mu_X)(X_{i+|t|} - \hat\mu_X),$$

for $|t| \le n - 1$, the sample autocovariance of order $t$. When it is clear which process
they refer to, we suppress the subindex $X$. Note that then we write $\mu_{X,k}$ as $\mu_k$. For
the sake of simplicity, let us denote $\gamma_X := \gamma_X(0)$ and analogously $\hat\gamma_X := \hat\gamma_X(0)$.
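
The sample quantities above can be sketched in a few lines of Python. This is only an illustration of the definitions: the function names are hypothetical, and the $n^{-1}$ normalization of the sample autocovariance is an assumption chosen to match the sample moments.

```python
def sample_mean(x):
    """Sample mean n^{-1} * sum_i x_i."""
    return sum(x) / len(x)

def sample_central_moment(x, k):
    """Sample centered moment n^{-1} * sum_i (x_i - mean)^k."""
    m = sample_mean(x)
    return sum((v - m) ** k for v in x) / len(x)

def sample_autocovariance(x, t):
    """n^{-1} * sum_{i=1}^{n-|t|} (x_i - mean)(x_{i+|t|} - mean).

    The 1/n normalization is an assumption (see the lead-in).
    """
    n, lag = len(x), abs(t)
    if lag >= n:
        raise ValueError("the lag must be smaller than the sample size")
    m = sample_mean(x)
    return sum((x[i] - m) * (x[i + lag] - m) for i in range(n - lag)) / n

x = [1.0, 2.0, 4.0, 3.0]
print(sample_mean(x), sample_central_moment(x, 2), sample_autocovariance(x, 1))
```

With this convention the autocovariance of order 0 coincides with the sample centered moment of order 2.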

Finally, by i.i.d.r.vs. we mean independent and identically distributed random variables.

We assume that all the random elements are defined on the same, rich enough, probability
space $(\Omega, \sigma, \mathbb{P})$.



3 Preliminary results 

In this section we discuss a characterization of Gaussian distributions in infinite-dimensional
spaces, a characterization of the one-dimensional Gaussian distributions and
two tests of Gaussianity for stationary processes. We also recall some facts on multiple
testing procedures. All this material provides tools for our results.

Excluding the characterization of the one-dimensional Gaussian distributions (Proposition 3.4),
the results in this section are well known and they are included here for the
sake of completeness.



3.1 Characterization of Gaussian distributions in Hilbert spaces 

The result of this subsection comes from [7]. It is based on the use of dissipative
distributions, which are defined next.

Definition 3.1. Let $D$ be an $\mathbb{H}$-valued random element. We will say that its distribution
is dissipative if there exists an orthonormal basis $(v_n)_{n \ge 1}$ of $\mathbb{H}$ such that

1. $\mathbb{P}(D_{V_n^{\perp}} = 0) = 0$ for all $n \ge 2$ (see Section 2 for the definition of $V_n$).

2. The conditional distribution of $D_{V_n}$ given $D_{V_n^{\perp}}$ is absolutely continuous with respect
to the $n$-dimensional Lebesgue measure.

Theorem 3.6 in [7] states the following:

Theorem 3.2 (Cuesta-Albertos et al. (2007)). Let $\eta$ be a dissipative distribution on $\mathbb{H}$.
If $X$ is an $\mathbb{H}$-valued random element and

$$\eta(\{h \in \mathbb{H} : \text{the distribution of } \langle X, h \rangle \text{ is Gaussian}\}) > 0,$$

then $X$ is Gaussian.

The importance of this result relies on the fact that if $\eta$ is dissipative then the following
0-1 law holds:

$$\eta(\{h \in \mathbb{H} : \text{the distribution of } \langle X, h \rangle \text{ is Gaussian}\}) \in \{0, 1\}.$$

Moreover, $X$ is not Gaussian if, and only if,

$$\eta(\{h \in \mathbb{H} : \text{the distribution of } \langle X, h \rangle \text{ is Gaussian}\}) = 0.$$

In other words, if we ask whether the distribution of $X$ is Gaussian, then the only thing we
have to do is to select at random a point $h \in \mathbb{H}$ using a dissipative distribution and check
whether the real-valued random variable $\langle X, h \rangle$ is Gaussian. We will obtain the right answer
with probability one.
3.2 Characterization of one-dimensional Gaussian distributions

We start this subsection by stating the definition of analytic characteristic function, which
has been taken from [20].

Definition 3.3. A characteristic function $\Phi$ is said to be analytic if there exist

• a complex-valued function $\phi$ of the complex variable $z$ which is holomorphic in a
circle $\{z : |z| < \rho\}$, where $\rho > 0$,

• a positive real number $\delta$ such that $\Phi(t) = \phi(t)$, for $|t| < \delta$.

That is, an analytic characteristic function is a characteristic function which coincides
with a holomorphic function in some neighborhood of zero.

Some properties of analytic characteristic functions may be found in [20]. In particular,
it is proved therein that the characteristic function of a Gaussian distribution is
analytic (this is a well-known fact). Some other well-known distributions having an analytic
characteristic function are the binomial, Poisson and gamma distributions, but not the
Cauchy one.

The following result will be useful to show that our goodness-of-fit test works against
all non-Gaussian alternatives.

Proposition 3.4. Let $P$ be a Borel probability measure defined on $\mathbb{R}$. Assume that $P$ is
absolutely continuous with respect to the Lebesgue measure. Let $Y$ be a r.v. having an
analytic characteristic function $\Phi_Y$.
Then, $Y$ is Gaussian if, and only if,

$$\exists\, m \in \mathbb{R},\ \exists\, s \in \mathbb{R}^+ \text{ such that } P(\{y \in \mathbb{R} : \Phi_Y(y) = \Phi_{m,s}(y)\}) > 0. \qquad (1)$$

Proof.

The necessary part is obvious. Let us show the sufficiency. As $Y$ satisfies (1) and $P$ is
absolutely continuous, the set $R := \{y \in \mathbb{R} : \Phi_Y(y) = \Phi_{m,s}(y)\}$ is infinite
and not denumerable. Thus, it contains at least an accumulation point.

Furthermore, the function $y \to \Phi_Y(y) - \Phi_{m,s}(y)$ is analytic, and it vanishes on $R$.
Therefore, this function has a non-isolated zero but the only analytical function with at 
least a non-isolated zero is the null function which proves the result (see for example 

□ 



Proposition 3.4 may be seen as a spectral counterpart of Theorem 3.2.



3.3 Classical tests of Gaussianity for stationary processes 

Throughout this section we present some popular tests for checking whether a stationary
random process $(Y_t)_{t \in \mathbb{Z}}$ is Gaussian.
3.3.1 Epps test

The test discussed in this section is a particular case of the one studied in Section 3 of
[11]. We begin with some notations and definitions. Given $N > 1$, let us define

$$\Lambda_N := \{\lambda := (\lambda_1, \ldots, \lambda_N)^T \in \mathbb{R}^N : \lambda_i \ne \lambda_j \text{ for all } i \ne j,\ i, j = 1, \ldots, N\},$$

where $T$ denotes transposition.

Let $\lambda \in \Lambda_N$ and let $\hat g(\lambda)$ be the $2N$-dimensional column vector composed of the real and
imaginary parts of the empirical characteristic function computed at $\lambda$. That is,

$$\hat g(\lambda) := \frac{1}{n} \sum_{i=1}^{n} (\cos(\lambda_1 Y_i), \sin(\lambda_1 Y_i), \ldots, \cos(\lambda_N Y_i), \sin(\lambda_N Y_i))^T.$$

Further, for $\nu \in \mathbb{R}$ and $\rho > 0$, let $g_{\nu,\rho}(\lambda)$ be the $2N$-dimensional vector composed of the real
and imaginary parts of $\Phi_{\nu,\rho}$ computed at $\lambda$:

$$g_{\nu,\rho}(\lambda) := (\mathrm{Re}(\Phi_{\nu,\rho}(\lambda_1)), \mathrm{Im}(\Phi_{\nu,\rho}(\lambda_1)), \ldots, \mathrm{Re}(\Phi_{\nu,\rho}(\lambda_N)), \mathrm{Im}(\Phi_{\nu,\rho}(\lambda_N)))^T.$$
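
As an illustration of these two vectors, here is a minimal Python sketch. The function names are hypothetical; the Gaussian characteristic function is written as $\Phi_{\nu,\rho}(\lambda) = \exp(i\nu\lambda - \rho\lambda^2/2)$, the standard formula for mean $\nu$ and variance $\rho$.

```python
import cmath
import math

def empirical_cf_vector(y, lam):
    """Real and imaginary parts of the empirical characteristic function of
    the sample y at each point of lam, as a flat list of length 2N."""
    n = len(y)
    out = []
    for l in lam:
        out.append(sum(math.cos(l * v) for v in y) / n)
        out.append(sum(math.sin(l * v) for v in y) / n)
    return out

def gaussian_cf_vector(nu, rho, lam):
    """Same layout for the N(nu, rho) characteristic function
    Phi(l) = exp(i * nu * l - rho * l^2 / 2)."""
    out = []
    for l in lam:
        phi = cmath.exp(1j * nu * l - 0.5 * rho * l * l)
        out.extend([phi.real, phi.imag])
    return out

sample = [0.3, -1.2, 0.8, 0.1]
print(empirical_cf_vector(sample, [0.5, 1.0]))
print(gaussian_cf_vector(0.0, 1.0, [0.5, 1.0]))
```

The Epps statistic compares these two vectors through a quadratic form, so both functions return their entries in the same (cos, sin) order.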

We denote by $f_Y(0, (\mu_Y, \gamma_Y), \lambda)$ the spectral density matrix (see, for example, [2]) of the
process

$$(g(Y_t, \lambda))_{t \in \mathbb{Z}} := ((\cos(\lambda_1 Y_t), \sin(\lambda_1 Y_t), \ldots, \cos(\lambda_N Y_t), \sin(\lambda_N Y_t))^T)_{t \in \mathbb{Z}}$$

at frequency 0. Notice that if we assume that $(Y_t)_{t \in \mathbb{Z}}$ is a Gaussian stationary process
with

$$\sum_{t \in \mathbb{Z}} |t|^{\zeta} |\gamma_Y(t)| < \infty, \text{ for some } \zeta > 0, \qquad (2)$$

then the existence of $f_Y(0, (\mu_Y, \gamma_Y), \lambda)$ is one of the conclusions of Lemma 2.1 in [11]. For
the construction of the test statistic we will use the following estimator of $f_Y(0, (\mu_Y, \gamma_Y), \lambda)$:

$$\hat f(0, \lambda) = (2\pi n)^{-1} \left( \sum_{t=1}^{n} \hat G(Y_{t,0}, \lambda) + 2 \sum_{i=1}^{\lfloor n^{2/5} \rfloor} \left(1 - \frac{i}{\lfloor n^{2/5} \rfloor}\right) \sum_{t=1}^{n-i} \hat G(Y_{t,i}, \lambda) \right), \qquad (3)$$

where $\hat G(Y_{t,i}, \lambda) = (g(Y_t, \lambda) - \hat g(\lambda))(g(Y_{t+i}, \lambda) - \hat g(\lambda))^T$ and $\lfloor \cdot \rfloor$ denotes the integer part.
The estimator (3) was used in [11], but with 2/5 replaced by a general constant in the
interval (0, 1/2). Notice also that it is a particular case of the one proposed in [12]. In [11]
it is proved that if $(Y_t)_{t \in \mathbb{Z}}$ is Gaussian, stationary and satisfies (2), then $\hat f(0, \lambda)$ converges
almost surely to $f_Y(0, (\mu_Y, \gamma_Y), \lambda)$. Let $G^+(\lambda)$ be the generalized inverse of $2\pi \hat f(0, \lambda)$ and
let $Q_n(\nu, \rho, \lambda)$ be the quadratic form

$$Q_n(\nu, \rho, \lambda) := (\hat g(\lambda) - g_{\nu,\rho}(\lambda))^T G^+(\lambda) (\hat g(\lambda) - g_{\nu,\rho}(\lambda)). \qquad (4)$$
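
The estimator of the spectral density matrix at frequency 0 can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the implementation used in [11]: it takes the truncation point to be $\lfloor n^{2/5} \rfloor$ with weights $1 - i/\lfloor n^{2/5} \rfloor$, it represents the $2N \times 2N$ matrix as nested lists, and the function names are hypothetical.

```python
import math

def g_vector(y_t, lam):
    """g(Y_t, lambda) = (cos(l_1 Y_t), sin(l_1 Y_t), ..., cos(l_N Y_t), sin(l_N Y_t))."""
    out = []
    for l in lam:
        out.extend([math.cos(l * y_t), math.sin(l * y_t)])
    return out

def spectral_estimate_at_zero(y, lam):
    """Sketch of the estimator of the 2N x 2N spectral density matrix at frequency 0."""
    n, d = len(y), 2 * len(lam)
    g = [g_vector(v, lam) for v in y]
    g_bar = [sum(row[c] for row in g) / n for c in range(d)]
    centred = [[row[c] - g_bar[c] for c in range(d)] for row in g]
    # floor(n^(2/5)); the epsilon guards against results just below an integer
    m = int(n ** 0.4 + 1e-9)

    def lag_sum(i):
        # sum over t of the outer product of centred g(Y_t) and centred g(Y_{t+i})
        s = [[0.0] * d for _ in range(d)]
        for t in range(n - i):
            for a in range(d):
                for b in range(d):
                    s[a][b] += centred[t][a] * centred[t + i][b]
        return s

    total = lag_sum(0)
    for i in range(1, m + 1):
        weight = 2.0 * (1.0 - i / m)
        s = lag_sum(i)
        for a in range(d):
            for b in range(d):
                total[a][b] += weight * s[a][b]
    scale = 1.0 / (2.0 * math.pi * n)
    return [[scale * v for v in row] for row in total]

f_hat = spectral_estimate_at_zero([0.2, -0.4, 1.1, 0.3, -0.9, 0.5], [0.5, 1.0])
print(len(f_hat), len(f_hat[0]))
```

The quadratic form $Q_n$ then requires a generalized inverse of $2\pi \hat f(0, \lambda)$, which is left out of this sketch; in practice a numerical linear algebra library would be used for that step.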

Let $\Theta$ be an open bounded subset of $\mathbb{R} \times \mathbb{R}^+$ and let $\lambda \in \Lambda_N$. We state two assumptions.

H1. The set $\Theta_0(\lambda) := \{(\nu, \rho) \in \Theta : \Phi_{\nu,\rho}(\lambda_i) = \Phi_{\mu_Y,\gamma_Y}(\lambda_i),\ i = 1, \ldots, N\}$ is nowhere dense
in $\Theta$.

H2. For each $(\nu, \rho) \in \Theta_0(\lambda)$ we have $f_Y(0, (\nu, \rho), \lambda) = f_Y(0, (\mu_Y, \gamma_Y), \lambda)$ and



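$$\frac{\partial \Phi_{x,y}(\lambda_i)}{\partial (x, y)} \bigg|_{(x,y) = (\nu,\rho)} = \frac{\partial \Phi_{x,y}(\lambda_i)}{\partial (x, y)} \bigg|_{(x,y) = (\mu_Y,\gamma_Y)}, \quad i = 1, \ldots, N.$$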



Theorem 3.5 below describes the Gaussianity test studied in [11].

Theorem 3.5 (Epps (1987)). Let $(Y_t)_{t \in \mathbb{Z}}$ be a stationary Gaussian process satisfying (2).
Let $\Theta$ be an open and bounded subset of $\mathbb{R} \times \mathbb{R}^+$ and $\lambda \in \Lambda_N$ such that H1 and H2
hold. Further, let $(\mu_n, \gamma_n)$ be the minimizer on $\Theta$ nearest to $(\mu_Y, \gamma_Y)$ of the map

$$(\nu, \rho) \to Q_n(\nu, \rho, \lambda).$$

Assume further that $f_Y(0, (\mu_Y, \gamma_Y), \lambda)$ is positive definite. Then, for each fixed $\lambda \in \Lambda_N$,
$n Q_n(\mu_n, \gamma_n, \lambda)$ converges in distribution to $\chi^2_{2N-2}$.



Remark 3.5.1. Obviously, a test based on Theorem 3.5 may not be consistent. Indeed,
it only focuses on the values of the characteristic function at some points. In other words,
the test may fail to detect alternatives with a Gaussian one-dimensional marginal. The test
even fails against alternatives with a non-Gaussian one-dimensional marginal whose
characteristic function coincides with that of the corresponding Gaussian distribution
at the selected points.

3.3.2 Lobato and Velasco test 

The test to assess normality of time series discussed in this subsection was introduced
in [21]. It uses the skewness-kurtosis test statistic, also called the Jarque-Bera test (see [6]
and [18]), but improves previous tests of this kind because the statistic is studentized by
standard error estimators.

Given a process $Y$, let us denote $\hat F_k := 2 \sum_{t=1}^{n-1} \hat\gamma_Y(t) (\hat\gamma_Y(t) + \hat\gamma_Y(n - t))^{k-1} + \hat\gamma_Y^k$. This
is an estimator of $F_k := \sum_{t=-\infty}^{\infty} \gamma_Y(t)^k$. The test proposed in [21] handles the statistic

$$G_Y := \frac{n \hat\mu_{Y,3}^2}{6 \hat F_3} + \frac{n (\hat\mu_{Y,4} - 3 \hat\mu_{Y,2}^2)^2}{24 \hat F_4}.$$
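
A minimal Python sketch of this statistic is given below. It assumes the sample autocovariance is normalized by $n^{-1}$ and that $\hat F_k = 2 \sum_{t=1}^{n-1} \hat\gamma(t)(\hat\gamma(t) + \hat\gamma(n-t))^{k-1} + \hat\gamma(0)^k$; the function name is hypothetical and the $O(n^2)$ computation is chosen for readability, not speed.

```python
import random

def lobato_velasco_statistic(y):
    """Skewness-kurtosis statistic studentized by F_hat_3 and F_hat_4 (sketch)."""
    n = len(y)
    mean = sum(y) / n
    c = [v - mean for v in y]
    mu2, mu3, mu4 = (sum(v ** k for v in c) / n for k in (2, 3, 4))
    # sample autocovariances gamma_hat(0), ..., gamma_hat(n - 1), normalized by n
    gam = [sum(c[i] * c[i + t] for i in range(n - t)) / n for t in range(n)]

    def f_hat(k):
        tail = sum(gam[t] * (gam[t] + gam[n - t]) ** (k - 1) for t in range(1, n))
        return 2.0 * tail + gam[0] ** k

    return n * mu3 ** 2 / (6.0 * f_hat(3)) + n * (mu4 - 3.0 * mu2 ** 2) ** 2 / (24.0 * f_hat(4))

rng = random.Random(0)
gaussian_path = [rng.gauss(0.0, 1.0) for _ in range(300)]
skewed_path = [rng.expovariate(1.0) for _ in range(300)]
print("Gaussian sample:   ", lobato_velasco_statistic(gaussian_path))
print("exponential sample:", lobato_velasco_statistic(skewed_path))
```

Under the null hypothesis the value would be compared with a quantile of the chi-square distribution with 2 degrees of freedom; a clearly skewed sample such as the exponential one should produce a much larger value.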

Theorem 3.6 (Lobato and Velasco (2004)). Let $(Y_t)_{t \in \mathbb{Z}}$ be an ergodic stationary process.

• If $(Y_t)_{t \in \mathbb{Z}}$ is Gaussian and satisfies $\sum_{t=0}^{\infty} |\gamma_Y(t)| < \infty$, then $G_Y \to \chi^2_2$ in distribution.

• If $(Y_t)_{t \in \mathbb{Z}}$ satisfies

- $E[Y_t^{16}] < \infty$,

- $\sum_{t_1=-\infty}^{\infty} \cdots \sum_{t_{q-1}=-\infty}^{\infty} |\kappa_q(t_1, \ldots, t_{q-1})| < \infty$, for $q = 2, \ldots, 16$, where $\kappa_q(t_1, \ldots, t_{q-1})$
denotes the $q$th-order cumulant of $Y_1, Y_{1+t_1}, \ldots, Y_{1+t_{q-1}}$,

- $\sum_{t=1}^{\infty} [E |E((Y_0 - \mu)^k | \mathcal{F}_{-t}) - \mu_k|^2]^{1/2} < \infty$, for $k = 3, 4$, where $\mathcal{F}_{-t}$ denotes the
$\sigma$-field generated by $Y_j$, $j \le -t$, and

- $E[(Y_0 - \mu)^k - \mu_k]^2 + 2 \sum_{t=1}^{\infty} E([(Y_0 - \mu)^k - \mu_k][(Y_t - \mu)^k - \mu_k]) > 0$, for $k = 3, 4$,

then the statistic $G_Y$ diverges to infinity whenever $\mu_{Y,3} \ne 0$ or $\mu_{Y,4} \ne 3 \mu_{Y,2}^2$.



In Section 4 we will prove this theorem under lighter assumptions on the alternative.
We will need the following recent result taken from [19]. It is an improvement of the
well-known result in pp. 

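Theorem 3.7 (Kavalieris (2008)). Let $(Y_t)_{t \in \mathbb{Z}}$ be a stationary process with the
representation

$$Y_t = \sum_{i=1}^{\infty} k(i) e_{t-i}, \quad \sum_{i=1}^{\infty} |k(i)| < \infty, \quad \sum_{i=1}^{\infty} i k(i) < \infty, \quad E[e_n] = 0, \text{ where } (e_t) \text{ are i.i.d.r.vs.} \qquad (5)$$

Assume that $E[|e_n|^{\alpha}] < \infty$ for some $2 < \alpha < 4$. If $r_n < c n^{\beta}$ for $0 < \beta < 1$ and $c > 0$, then

$$\max_{0 \le t \le r_n} |\hat\gamma(t) - \gamma(t)| = o(n^{2/\alpha - 1}) \text{ almost surely.}$$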



3.4 Multiple testing 

In Section 5 we will propose to use several tests to assess the Gaussianity of a process.
Thus we obtain several p-values $p_1, \ldots, p_k$, where $k$ is the number of procedures used.

The most popular way to handle several p-values is to use the Bonferroni correction.
However, it is well known that this procedure is too conservative. Several alternatives
have been proposed in the literature in order to alleviate this problem. Here, we will
employ the false discovery rate (FDR). The FDR is the expected proportion of wrongly
rejected hypotheses among the $k$ tests. Taking into account that all the hypotheses we have
are equivalent, the FDR coincides with the level of the procedure.

The FDR was introduced in Benjamini and Hochberg [4] for independent tests. Here,
we employ the improvement proposed in [5], which does not require any assumption on the
dependence among the tests. This procedure, when applied to our case, works as follows:

Theorem 3.8 (Benjamini and Yekutieli (2001)). Let us assume that we apply $k$ statistical
tests to check the same null hypothesis and that the ordered p-values that we obtain are
$p_{(1)}, \ldots, p_{(k)}$, where $p_{(1)} \le \ldots \le p_{(k)}$.

Let $\alpha \in (0, 1)$. The FDR of the test which rejects if the set

$$\left\{ i : p_{(i)} \le \frac{i\, \alpha}{k \sum_{j=1}^{k} j^{-1}} \right\}$$

is not empty is, at most, $\alpha$.
Therefore, according to the previous theorem, if we denote

$$p_0 := k \left( \sum_{i=1}^{k} i^{-1} \right) \min_{i=1,\ldots,k} \frac{p_{(i)}}{i},$$

we can reject at any level $\alpha \ge p_0$ and then we can take $p_0$ as the resulting p-value of the
procedure.
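
The combined p-value $p_0$ is simple to compute. The following Python sketch implements the formula above; the function name is hypothetical, and capping the result at 1 is an added convention (the formula itself can exceed 1).

```python
def combined_pvalue(pvalues):
    """p_0 = k * (sum_{i=1}^k 1/i) * min_i p_(i) / i, capped at 1."""
    k = len(pvalues)
    ordered = sorted(pvalues)                      # p_(1) <= ... <= p_(k)
    harmonic = sum(1.0 / i for i in range(1, k + 1))
    p0 = k * harmonic * min(p / i for i, p in enumerate(ordered, start=1))
    return min(1.0, p0)                            # cap: added convention

# Two tests with p-values 0.04 and 0.01: p_0 = 2 * 1.5 * min(0.01, 0.02) = 0.03
print(combined_pvalue([0.04, 0.01]))
```

With a single test ($k = 1$) the function returns the p-value unchanged, which is a quick sanity check of the normalization.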

4 A Gaussianity test for stationary processes 

In this section we present a universal test to check whether the distribution of a stationary
process is Gaussian. Thus, given $X := (X_t)_{t \in \mathbb{Z}}$, a stationary process of real-valued random
variables, we are interested in constructing a test for the null hypothesis

$H_0$ : $X$ is Gaussian

against the alternative

$H_a$ : $X$ is not Gaussian.

Notice that $H_0$ holds if, and only if, for all $t \in \mathbb{N}$, $(X_1, \ldots, X_t)^T$ is a Gaussian vector. As
$X$ is stationary, this is equivalent to the distribution of $(X_t)_{t \le 0}$ being Gaussian. In addition, it
is the same as the Gaussianity of the process $X^{(t)} := (X_j)_{j \le t}$, for any $t \in \mathbb{Z}$. To check
whether $X^{(t)}$ is Gaussian, we only need to include $X^{(t)}$ in an appropriate Hilbert space,
then select a vector $h$ using a dissipative distribution, and compute the scalar product
$\langle X^{(t)}, h \rangle$ because, according to Theorem 3.2, almost surely, $X^{(t)}$ is Gaussian if, and only
if, $\langle X^{(t)}, h \rangle$ is Gaussian.

Concerning the Hilbert space in which the process is included, let us consider the space
of sequences

$$\ell^2 := \left\{ (x_n)_{n \in \mathbb{N}} : \sum_{n \in \mathbb{N}} x_n^2 a_n < \infty \right\}$$

with $a_0 := 1$ and $a_n = n^{-2}$ $(n \ge 1)$, endowed with the scalar product

$$\langle x, y \rangle = \sum_{n \in \mathbb{N}} x_n y_n a_n, \text{ where } x = (x_n)_{n \in \mathbb{N}},\ y = (y_n)_{n \in \mathbb{N}}.$$

It is easy to see that if $X$ is a stationary process and if the variance of $X_t$ is finite,
then, almost surely, $X^{(t)} \in \ell^2$ and that Gaussianity in this space is equivalent to the
(usual sense) Gaussianity of $X^{(t)}$. The reason is that $E[\sum_{n \ge 0} X_{-n}^2 a_n]$ is finite whenever
the variance of $X_t$ is finite.

Now we need a dissipative distribution on $\ell^2$. We will use the so-called Dirichlet
distribution (see [25]). We build this distribution using the so-called stick-breaking method.
That is, let $\alpha_1, \alpha_2 > 0$ and consider the probability distribution which selects a random
point in $\ell^2$ according to the following iterative procedure:

• $l_0 \in [0, 1]$ is chosen with the beta distribution of parameters $\alpha_1$ and $\alpha_2$.

• Given $n \ge 1$, $l_n \in [0, 1 - \sum_{i=0}^{n-1} l_i]$ is chosen with the beta distribution of parameters
$\alpha_1$ and $\alpha_2$ times $1 - \sum_{i=0}^{n-1} l_i$.

Let us define $H_n = (l_n / a_n)^{1/2}$ for $n \ge 0$ and take $H = (H_0, H_1, \ldots)^T$. It can be easily
checked that the distribution of $H$ is dissipative (see Definition 3.1). Moreover, $H \in \ell^2$
almost surely because, as shown in Proposition 4.1, $\|H\| = 1$, almost surely.
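
The stick-breaking construction can be sketched in Python as follows. This is a truncated illustration: only the first `m` coordinates are drawn (the truncation level `m` and the function names are assumptions of the sketch), and the weights are taken to be $a_0 = 1$ and $a_n = n^{-2}$, which are summable as the finiteness argument above requires.

```python
import math
import random

def weight(n):
    """Weights of the sequence space: a_0 = 1 and a_n = n^{-2} for n >= 1."""
    return 1.0 if n == 0 else 1.0 / (n * n)

def draw_h(alpha1, alpha2, m, rng=random):
    """Draw the first m stick lengths l_n and the coordinates H_n = (l_n / a_n)^{1/2}."""
    remaining = 1.0                    # length of the stick not yet broken
    sticks = []
    for _ in range(m):
        l_n = rng.betavariate(alpha1, alpha2) * remaining
        sticks.append(l_n)
        remaining -= l_n
    h = [math.sqrt(l_n / weight(n)) for n, l_n in enumerate(sticks)]
    return sticks, h

sticks, h = draw_h(2.0, 7.0, 50, random.Random(1))
norm_sq = sum(h_n ** 2 * weight(n) for n, h_n in enumerate(h))
print("sum of stick lengths:", sum(sticks))
print("weighted squared norm of h:", norm_sq)
```

The weighted squared norm equals the sum of the stick lengths, so it approaches 1 as `m` grows, in line with Proposition 4.1.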



Proposition 4.1. Let $H = (H_n)_{n \ge 0}$ be a stochastic process constructed as described above.
Let $a := \alpha_1 / (\alpha_1 + \alpha_2)$ be the mean of the beta distribution of parameters $\alpha_1$ and $\alpha_2$. Then,
we have that

1. $E[l_n] = a(1 - a)^n$, for every $n \in \mathbb{N}^*$.

2. $\|H\| = 1$, almost surely.

Proof.

Obviously 1. holds for $n = 0$. Thus, let us assume that 1. is satisfied for every index up to $n \in \mathbb{N}^*$ and
let us show that it also holds for $n + 1$. By construction, we have that if $\beta$ is a random
variable with beta distribution of parameters $\alpha_1$ and $\alpha_2$, then

$$E[l_{n+1}] = E[\beta] \left(1 - \sum_{i=0}^{n} E[l_i]\right) = a \left(1 - \sum_{i=0}^{n} a(1 - a)^i\right) = a(1 - a)^{n+1},$$

where the last equality comes from the formula for the sum of the first terms
of a geometric progression. Concerning 2., we have that

$$\|H\|^2 = \sum_{i=0}^{\infty} H_i^2 a_i = \sum_{i=0}^{\infty} l_i \le 1. \qquad (6)$$

Indeed, by construction, for every $n \in \mathbb{N}$, $\sum_{i=0}^{n} l_i \le 1$. However, applying 1., we have that

$$E[\|H\|^2] = \sum_{i=0}^{\infty} a(1 - a)^i = 1.$$

So that, by (6), we obtain 2. □

Now, let $h = (h_i)_{i \in \mathbb{N}}$ be a fixed realization of the random element $H$, drawn
independently of the process $X$. Let us consider the process $Y = (Y_t)_{t \in \mathbb{Z}}$ given by the projections
of $(X^{(t)})_{t \in \mathbb{Z}}$ on the one-dimensional subspace generated by $h$, i.e.

$$Y_t = \sum_{i=0}^{\infty} h_i a_i X_{t-i}, \quad t \in \mathbb{Z}. \qquad (7)$$
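
In practice only a finite path and finitely many coordinates of $h$ are available, so the projection has to be truncated. The sketch below computes $Y_t = \sum_{i=0}^{m-1} h_i a_i X_{t-i}$ for the time points where all the required past values are observed. The truncation, the function names and the weights $a_0 = 1$, $a_n = n^{-2}$ are assumptions of this illustration.

```python
def weight(n):
    """Weights of the sequence space: a_0 = 1 and a_n = n^{-2} for n >= 1."""
    return 1.0 if n == 0 else 1.0 / (n * n)

def project(x, h):
    """Truncated projection: Y_t = sum_{i < len(h)} h_i * a_i * x[t - i].

    Returns the values for t = len(h) - 1, ..., len(x) - 1, i.e. only where
    every lagged observation x[t - i] is available.
    """
    m = len(h)
    coef = [h_i * weight(i) for i, h_i in enumerate(h)]
    return [sum(coef[i] * x[t - i] for i in range(m)) for t in range(m - 1, len(x))]

x = [1.0, 2.0, 3.0, 4.0]
print(project(x, [1.0]))            # h = (1): the path itself
print(project(x, [0.0, 0.0, 1.0]))  # only lag 2, weighted by a_2 = 1/4
```

A one-dimensional Gaussianity test, such as the ones of Section 3.3, would then be applied to the projected path.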



As we will see in Proposition 4.3, the properties of the process $X$ are inherited by the
process $Y$. Moreover, according to Theorem 3.2, to assess the Gaussianity of $X$ it is enough
to do the job on the one-dimensional marginal distributions of $Y$. This can be done, for
instance, with the Epps or the Lobato and Velasco tests presented in Section 3.3, whenever $Y$
satisfies the appropriate assumptions. The following subsections are devoted to this task.
We begin by proving Lemma 4.2, which is necessary for Proposition 4.3.



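Lemma 4.2. Let $X$ be an ergodic and stationary process such that $\sum_{t=0}^{\infty} |\gamma_X(t)| < \infty$. If
we select $H$ as described above, then,

1. $\sum_{i=0}^{\infty} H_i a_i < \infty$ almost surely.

2. Almost surely, the random variable $L := \sum_{i,j=0}^{\infty} H_i H_j a_i a_j |X_{-i} - \mu_X| |X_{t-j} - \mu_X|$ is
conditionally integrable given $H$.

Proof.

1. It is straightforward because the Cauchy-Schwarz inequality gives that

$$\sum_{i=0}^{\infty} H_i a_i \le \left( \sum_{i=0}^{\infty} H_i^2 a_i \right)^{1/2} \left( 1 + \sum_{i=1}^{\infty} \frac{1}{i^2} \right)^{1/2} = \left( 1 + \sum_{i=1}^{\infty} \frac{1}{i^2} \right)^{1/2} < \infty, \text{ almost surely,}$$

where the last equality comes from Proposition 4.1.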



To prove 2., let $h = (h_0, h_1, \ldots)$ be a fixed realization of $H$. We have that

$$E[L | h] = \sum_{i,j=0}^{\infty} h_i h_j a_i a_j E[|X_{-i} - \mu_X| |X_{t-j} - \mu_X|]
\le \sum_{i,j=0}^{\infty} h_i h_j a_i a_j (E[(X_{-i} - \mu_X)^2])^{1/2} (E[(X_{t-j} - \mu_X)^2])^{1/2} = \gamma_X \left( \sum_{i=0}^{\infty} h_i a_i \right)^2.$$

Thus, $L$ is conditionally integrable thanks to 1. and to the fact that $\gamma_X \le \sum_{t=0}^{\infty} |\gamma_X(t)| < \infty$. □

In the sequel $\gamma_{Y|h}(t)$ denotes the conditional autocovariance of order $t$ of $Y$ given $h$.
That is, denoting by $\mu_{Y|h}$ the conditional expectation of $Y_0$ given $h$,

$$\gamma_{Y|h}(t) := E[(Y_0 - \mu_{Y|h})(Y_t - \mu_{Y|h}) | h].$$

Proposition 4.3. Let $(X_t)_{t \in \mathbb{Z}}$ be an ergodic and stationary process such that $\sum_{t=0}^{\infty} t^{\zeta} |\gamma_X(t)| < \infty$,
with $\zeta > 0$. Then, conditionally on $h$, the process $(Y_t)_{t \in \mathbb{Z}}$ defined in (7) is ergodic and
stationary. In addition, $E[|Y_0| \,|\, h]$ and $\sum_{t=0}^{\infty} t^{\zeta} |\gamma_{Y|h}(t)|$ are finite.

Proof.

$(X_t)_{t \in \mathbb{Z}}$ is a stationary ergodic process. So, conditionally on $h$, $(Y_t)_{t \in \mathbb{Z}}$ is also a
stationary ergodic process (see [10], page 458).
Using the definition of the process $Y$ we have

$$E[|Y_0| \,|\, h] \le E\left[ \sum_{i=0}^{\infty} h_i a_i |X_{-i}| \right] = E[|X_0|] \sum_{i=0}^{\infty} h_i a_i < \infty, \text{ a.s.,}$$

because of 1. in Lemma 4.2.

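By 2. in Lemma 4.2, we have that

$$\gamma_{Y|h}(t) = E\left[ \sum_{i,j=0}^{\infty} h_i h_j a_i a_j (X_{-i} - \mu_X)(X_{t-j} - \mu_X) \,\Big|\, h \right]$$

exists. So, using the dominated convergence theorem, we obtain that

$$\gamma_{Y|h}(t) = \sum_{i,j=0}^{\infty} h_i h_j a_i a_j \gamma_X(t - j + i)$$

and

$$\sum_{t=0}^{\infty} t^{\zeta} |\gamma_{Y|h}(t)| \le \sum_{i,j=0}^{\infty} h_i h_j a_i a_j \sum_{t=0}^{\infty} t^{\zeta} |\gamma_X(t - j + i)|.$$

Obviously, $\sum_{i,j=0}^{\infty} h_i h_j a_i a_j \sum_{t=0}^{\infty} t^{\zeta} |\gamma_X(t - j + i)| = T_1 + T_2 + T_3$, where

• $T_1 = \sum_{j=0}^{\infty} h_j a_j \sum_{i=j}^{\infty} h_i a_i \sum_{t=0}^{\infty} t^{\zeta} |\gamma_X(t - j + i)|$,

• $T_2 = \sum_{j=0}^{\infty} h_j a_j \sum_{i=0}^{j-1} h_i a_i \sum_{t=2j+1}^{\infty} t^{\zeta} |\gamma_X(t - j + i)|$,

• $T_3 = \sum_{j=0}^{\infty} h_j a_j \sum_{i=0}^{j-1} h_i a_i \sum_{t=0}^{2j} t^{\zeta} |\gamma_X(t - j + i)|$.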



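If $i \ge j$, as $t \in \mathbb{N}^*$ and $\zeta > 0$, we have $t^{\zeta} \le (t - j + i)^{\zeta}$. Thus,

$$T_1 \le \sum_{j=0}^{\infty} h_j a_j \sum_{i=j}^{\infty} h_i a_i \sum_{t=0}^{\infty} (t - j + i)^{\zeta} |\gamma_X(t - j + i)| \le \sum_{j=0}^{\infty} h_j a_j \sum_{i=0}^{\infty} h_i a_i \sum_{t=0}^{\infty} t^{\zeta} |\gamma_X(t)|,$$

because $t - j + i \ge t$. Then, $\sum_{t=0}^{\infty} t^{\zeta} |\gamma_X(t)| < \infty$ and so, 1. in Lemma 4.2 implies $T_1 < \infty$.

Concerning $T_2$, as $j > i$ and $t - j + i > 0$, we can apply the $c_r$-inequality (see
[22], p. 157) to $t = (t - j + i) + (j - i)$ to obtain that there exists $c_{\zeta} > 0$ such that
$t^{\zeta} \le c_{\zeta} (t - j + i)^{\zeta} + c_{\zeta} (j - i)^{\zeta} \le 2 c_{\zeta} (t - j + i)^{\zeta}$. Thus,

$$T_2 \le 2 c_{\zeta} \sum_{j=0}^{\infty} h_j a_j \sum_{i=0}^{j-1} h_i a_i \sum_{t=2j+1}^{\infty} (t - j + i)^{\zeta} |\gamma_X(t - j + i)| \le 2 c_{\zeta} \sum_{j=0}^{\infty} h_j a_j \sum_{i=0}^{j-1} h_i a_i \sum_{t=0}^{\infty} t^{\zeta} |\gamma_X(t)|.$$

Then, using the same arguments as for $T_1$, we obtain that $T_2 < \infty$.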

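For $T_3$, the fact that $\sum_{t=0}^{\infty} t^{\zeta} |\gamma_X(t)| < \infty$ implies that there exists an $R > 0$ such that
$|\gamma_X(t)| \le R$ for all $t \in \mathbb{Z}$. Therefore,

$$T_3 \le R \left( \sum_{i=0}^{\infty} h_i a_i \right) \sum_{j=0}^{\infty} h_j a_j \sum_{t=0}^{2j} t^{\zeta} \le R \left( \sum_{i=0}^{\infty} h_i a_i \right) \left( \sum_{j=0}^{\infty} h_j a_j (2j)^{\zeta} (2j + 1) \right) =: R \left( \sum_{i=0}^{\infty} h_i a_i \right) T_3^*.$$

By 1. in Lemma 4.2, to show that $T_3 < \infty$, we only need to prove that $T_3^* < \infty$.

Furthermore, applying the Jensen inequality and 1. in Proposition 4.1, we have that

$$E[T_3^*] \le \sum_{j=0}^{\infty} a_j^{1/2} (2j)^{\zeta} (2j + 1) a^{1/2} (1 - a)^{j/2}. \qquad (8)$$

This last series is convergent ($a \in (0, 1)$). Hence, $T_3$ is finite almost surely and the proof
is ended. □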

4.1 Conditions to apply Epps test 

In this subsection we analyze the theoretical behavior of the Epps test when applied to the
randomly projected process (see Theorem 4.7). Moreover, in a corollary (Corollary 4.8)
we will show that if $\lambda$ is drawn randomly, then the Epps test is consistent against many
more alternatives.

Let us first state Lemma 4.4, which gives the consistency of the estimator of the spectral
density function at zero defined in (3). Let us denote by $\kappa_{lmno}(q, r, q+r; \lambda)$ the fourth-order
cumulant of $Z_{0,l}$, $Z_{q,m}$, $Z_{r,n}$ and $Z_{q+r,o}$, where, for instance, $Z_{q,m}$ is the $m$-th component
of the vector $g(Y_q, \lambda) - g_{\mu_Y,\gamma_Y}(\lambda)$ (see Subsection 3.3.1).



Lemma 4.4. Let $\lambda \in \Lambda_N$. If $Y$ is a stationary process such that

$$\sup_{-\infty < q < \infty} \sum_{r=-\infty}^{\infty} |\kappa_{lmno}(q, r, q + r; \lambda)| < \infty \text{ for each } l, m, n, o \in \{1, \ldots, N\}, \qquad (9)$$

then $\hat f(0, \lambda) \to f_Y(0, (\mu_Y, \gamma_Y), \lambda)$ almost surely.

Proof.

It is straightforward from the proof of Lemma 2.2 in [11], but substituting (9) for the use
of (2) and the Gebelein inequality [14] for Gaussian processes. The Gebelein inequality says that
the autocovariance of a multidimensional process is smaller than or equal to the product of the
variances of the marginals. □

Lemma 3.1 in [11] proves that if $Y$ is a stationary Gaussian process that satisfies
(2), then (9) holds. In [23], the Gebelein inequality is extended to two-dimensional vectorial
processes with diagonal densities. Hence, any stationary process that satisfies (2) and
whose two-dimensional marginal has a diagonal density also satisfies (9).

Let $\Theta$ be an open and bounded subset of $\mathbb{R} \times \mathbb{R}^+$. In [11], it is proved that H1 and H2
(see Subsection 3.3.1) are satisfied if $\lambda_i$ is equal to a rational number times $\lambda_1$, $i = 2, \ldots, N$.
Now, thanks to Lemma 4.5 below, we have that $\lambda$ can be taken at random and still fulfill
H1 and H2.

Lemma 4.5. Assume that $\lambda = (\lambda_1, \ldots, \lambda_N)^T \in \Lambda_N$ $(N > 1)$ is drawn randomly and
has distribution $P_{\lambda}$ having the following properties. First, $P_{\lambda}$ is such that $\lambda_1$ and $\lambda_2$ are
independent and identically distributed and have a density. Further, for $N > 2$, $\lambda_i$ is a
rational number times $\lambda_1$. Then, H1 and H2 are fulfilled almost surely.
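
One way to draw such a random vector is sketched below. The choices made here are assumptions of the illustration and are not prescribed by the lemma: the common density of the first two coordinates is taken to be exponential, the remaining coordinates are user-supplied rational multiples of the first one, and the function name is hypothetical.

```python
import random
from fractions import Fraction

def draw_lambda(multipliers=(), rng=random):
    """Draw (lambda_1, ..., lambda_N): lambda_1 and lambda_2 are i.i.d. with a
    density (exponential here, an arbitrary choice); each further coordinate is
    a rational multiple of lambda_1.

    `multipliers` must contain distinct rationals other than 0 and 1, so that
    the coordinates are pairwise different.
    """
    mults = [Fraction(q) for q in multipliers]
    if len(set(mults)) != len(mults) or any(q in (0, 1) for q in mults):
        raise ValueError("multipliers must be distinct rationals other than 0 and 1")
    lam1 = rng.expovariate(1.0)
    lam2 = rng.expovariate(1.0)
    return [lam1, lam2] + [float(q) * lam1 for q in mults]

print(draw_lambda(["1/2", "3/2"], random.Random(3)))
```

Since the first two coordinates have a density, their ratio is irrational with probability one, which is the property used in the proof of the lemma.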
Proof.

Proceeding as in [11] we have that

$$\Theta_0(\lambda) \subseteq \{(\nu, \gamma_Y) : \nu \lambda_1 = \mu_Y \lambda_1 + 2\pi k \text{ and } \nu \lambda_2 = \mu_Y \lambda_2 + 2\pi k^*, \text{ with } k, k^* \in \mathbb{Z}\}.$$

Now, in order for the cardinality of $\Theta_0(\lambda)$ to be larger than one, we need $\lambda_2$ to be equal
to a rational number times $\lambda_1$. However, this happens with probability zero and so, with
probability one, $\Theta_0(\lambda) \subset \{(\mu_Y, \gamma_Y)\}$. Thus, H1 and H2 follow directly. □



Note that in case $N > 1$ Lemma 4.5 remains valid if we draw independently at random
$\lambda_i$, $i = 3, \ldots, N$. In addition, thanks to this lemma we have the following corollary of
Theorem 3.5.

Corollary 4.6. Let $(Y_t)_{t \in \mathbb{Z}}$ be a stationary Gaussian process which satisfies (2) and $\lambda$ be
as in Lemma 4.5. Let $(\mu_n, \gamma_n)$ be the minimizer on $\Theta$ of the map $(\nu, \rho) \to Q_n(\nu, \rho, \lambda)$ nearest
to $(\mu, \gamma)$. If we assume that $f_Y(0, (\mu_Y, \gamma_Y), \lambda)$ is positive definite, then $n Q_n(\mu_n, \gamma_n, \lambda)$
converges in distribution to $\chi^2_{2N-2}$.

In the next theorem, the function $Q_n$ also depends on the random $h$. However, for the
sake of simplicity we do not make this functional relationship explicit.

Theorem 4.7. Let $X$ be an ergodic stationary process satisfying (2). Draw respectively
$\lambda$ as in Lemma 4.5 and $h$ independently of $X$ using $P_h$ (as described above).

Assume that, conditionally on $h$, $Y$ defined in (7) satisfies that the characteristic
function of its one-dimensional marginal is analytic and that $f_{Y|h}(0, (\mu_{Y|h}, \gamma_{Y|h}), \lambda)$ exists
and is positive definite for almost every $h$. Let $Q_n(\cdot, \cdot, \lambda)$ be the quadratic form defined
in (4) applied to $Y$ and $(\mu_n, \gamma_n)$ its minimizer on $\Theta$ nearest to $(\mu_{Y|h}, \gamma_{Y|h})$. Let further

$$A := \{(\lambda, h) : n Q_n(\mu_n, \gamma_n, \lambda) \to_d \text{ a non-degenerate distribution}\}.$$

Then, $X$ is Gaussian if, and only if, $(P_{\lambda} \otimes P_h)[A] > 0$.

Proof. 

The necessary part is obvious because, if $X$ is Gaussian, then $Y$ is also Gaussian and
Proposition 4.3 implies that $Y$ satisfies the assumptions of Corollary 4.6.

Let us show the sufficient part. As $(P_{\lambda} \otimes P_h)[A] > 0$, there exist $h$ and $\lambda$
with $\lambda_1 \ne 0$ and $\lambda_2 \ne 0$ such that $n Q_n(\mu_n, \gamma_n, \lambda)$ converges in law to a non-degenerate
distribution. In addition, we may assume without loss of generality that $\Phi_{Y_0}(\lambda_1) \ne 0$
and $\Phi_{Y_0}(\lambda_2) \ne 0$. Indeed, as $\Phi_{Y_0}$ is an analytic characteristic function, it has only isolated
zeros.

Therefore, $Q_n(\mu_n, \gamma_n, \lambda)$ converges in probability to zero. By Lemma 4.4, $\hat f(0, \lambda)$ converges
to $f_{Y|h}(0, (\mu_{Y|h}, \gamma_{Y|h}), \lambda)$. Thus, $\lim_n G^+$ is positive definite because it is the inverse of
$2\pi f_{Y|h}(0, (\mu_{Y|h}, \gamma_{Y|h}), \lambda)$. This together with (4) gives that

$$\hat g(\lambda) - g_{\mu_n,\gamma_n}(\lambda) \to_{c.p.} 0. \qquad (10)$$

Since $X$ is an ergodic stationary process, by [10], page 458, we have that $(g(Y_t, \lambda))_{t \in \mathbb{Z}}$ is
also an ergodic stationary process. Thus, as $E|\cos(\lambda_i Y_0)| < \infty$ and $E|\sin(\lambda_i Y_0)| < \infty$ for
all $i = 1, \ldots, N$, we have by Theorem 2 of Chapter IV in [16] that

$$\hat g(\lambda) \to_{c.p.} E[g(Y_0, \lambda)].$$

From this and (10) we may conclude that $\Phi_{\mu_n,\gamma_n}(\lambda_i)$ converges in probability to $\Phi_{Y_0}(\lambda_i)$
$(i = 1, \ldots, N)$.

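Let us see how this implies that the sequence $\{(\hat\mu_n,\hat\gamma_n)\}_{n\in\mathbb{N}}$ converges. We have that

$$\lim_n |\Phi_{\hat\mu_n,\hat\gamma_n}(\lambda_1)| = \lim_n e^{-\lambda_1^2\hat\gamma_n/2} = |\Phi_{Y_0}(\lambda_1)|, \quad \text{in probability},$$

and, since $\lambda_1 \neq 0$ and $\Phi_{Y_0}(\lambda_1) \neq 0$, this implies that there exists $s \geq 0$ such that $s = \lim_{n\to\infty}\hat\gamma_n$ in probability. Note that there exists $\theta \in [0,2\pi)$ such that

$$\Phi_{Y_0}(\lambda_1) = |\Phi_{Y_0}(\lambda_1)|\exp(i\theta).$$

As $\lambda_1 \neq 0$, if we take $m := \theta/\lambda_1$, then we have that $\Phi_{Y_0}(\lambda_1) = \Phi_{m,s}(\lambda_1)$.

Analogously, we have that $|\Phi_{Y_0}(\lambda_2)| = \lim_{n\to\infty} e^{-\lambda_2^2\hat\gamma_n/2}$, in probability, and as $s = \lim_{n\to\infty}\hat\gamma_n$ we obtain

$$|\Phi_{Y_0}(\lambda_2)| = e^{-\lambda_2^2 s/2}. \qquad (11)$$

Denoting $r = \lambda_2/\lambda_1$, we obtain that

$$\frac{\Phi_{Y_0}(\lambda_2)}{|\Phi_{Y_0}(\lambda_2)|} = \lim_n e^{i\lambda_2\hat\mu_n} = \left(\frac{\Phi_{Y_0}(\lambda_1)}{|\Phi_{Y_0}(\lambda_1)|}\right)^{r} = e^{i r \lambda_1 m}.$$

Together with (11) this gives $\Phi_{Y_0}(\lambda_2) = \Phi_{m,s}(\lambda_2)$.

As $\lambda_2$ was drawn independently from $\lambda_1$ with a distribution absolutely continuous with respect to the Lebesgue measure and as $\Phi_{Y_0}$ is analytic, by Proposition 3.4 we get that $Y_0$ is Gaussian. Then, by Theorem we obtain that the process $X$ is Gaussian. □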

Remark 4.7.1. The ergodicity assumption on $X$ is only needed to prove the converse part of Theorem 4.7, since every stationary Gaussian process which satisfies ^ is ergodic (see, for example, [11]).



Applying the arguments of Theorem 4.7 directly to the process $X$, we obtain the following corollary. It gives a modification of Epps test with better consistency properties.

Corollary 4.8. Let $X$ be an ergodic stationary process. Assume that the characteristic function of its one-dimensional marginal is analytic. Assume further that holds. Let us take $\lambda$ as in Lemma 4.5, $Q_n(\cdot,\cdot,\lambda)$ as in Q); let $(\hat\mu_n,\hat\gamma_n)$ be its minimizer on nearest to $(\mu_X,\gamma_X)$ and

$$B := \{\lambda : nQ_n(\hat\mu_n,\hat\gamma_n,\lambda) \to_d \text{a non-degenerate distribution}\}.$$

If we assume that $f_X(0,(\mu_X,\gamma_X),\lambda)$ exists and is positive definite, then $X$ is Gaussian if, and only if, $P_\lambda(B) > 0$.
The remark below follows directly from Theorems 3.5 and 4.7. It allows us to perform a test based on the asymptotic distribution of $nQ_n(\hat\mu_n,\hat\gamma_n,\lambda)$.

Remark 4.8.1. Theorem 4.7 and Corollary 4.8 remain valid if, in the definition of the sets $A$ and $B$, we replace "non-degenerate distribution" by "chi-squared distribution with $2(N-1)$ degrees of freedom".

In addition, we have the following corollary.

Corollary 4.9. Under the assumptions of Theorem 4.7, $(P_\lambda \otimes P_h)[A] \in \{0,1\}$ and $X$ is Gaussian if, and only if, $(P_\lambda \otimes P_h)[A] = 1$.

Analogously, under the assumptions of Corollary 4.8, $P_\lambda(B) \in \{0,1\}$ and $X$ is Gaussian if, and only if, $P_\lambda(B) = 1$.

4.2 Conditions to apply Lobato and Velasco test 

In this subsection we show that a slight modification $\tilde G_Y$ of the statistic $G_Y$ satisfies Theorem 3.6 under different assumptions than the ones used in [21]. The test statistic is

$$\tilde G_Y = \frac{n\hat\mu_3^2}{6|\tilde F_3|} + \frac{n(\hat\mu_4 - 3\hat\mu_2^2)^2}{24|\tilde F_4|},$$

with

$$\tilde F_k = 2\sum_{t=1}^{r_n} \hat\gamma(t)\big(\hat\gamma(t) + \hat\gamma(r_n+1-t)\big)^{k-1} + \hat\gamma^k,$$

where, according to Theorem 3.7, we take $r_n < c\,n^{\beta}$ for $\beta = 1 - 2/\alpha$, $c > 0$ and $2 < \alpha < 4$. Thus, the differences between $G_Y$ and $\tilde G_Y$ are the absolute values in the denominators and the number of terms involved in $\tilde F_k$.
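As an illustration, the statistic above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `sample_autocov` and `modified_lv_statistic` are hypothetical, the sample autocovariances use the divisor $n$, $\hat\gamma$ is taken to be the lag-0 sample autocovariance, and $r_n$ is taken as the largest integer strictly below $c\,n^{\beta}$. The p-value uses the fact that the survival function of a chi-squared distribution with 2 degrees of freedom is $e^{-x/2}$.

```python
import math


def sample_autocov(x, lag):
    """Sample autocovariance of x at the given lag (divisor n, an assumption)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def modified_lv_statistic(x, c=1.0, beta=0.5):
    """Return (statistic, asymptotic p-value) for the truncated statistic."""
    n = len(x)
    mean = sum(x) / n
    # Sample central moments of order 2, 3 and 4.
    mu2, mu3, mu4 = (sum((v - mean) ** k for v in x) / n for k in (2, 3, 4))
    # Largest integer strictly below c * n**beta, capped at n - 1.
    r_n = max(0, min(math.ceil(c * n ** beta) - 1, n - 1))
    gam = [sample_autocov(x, t) for t in range(r_n + 1)]

    def f_tilde(k):
        # 2 * sum_{t=1}^{r_n} gam(t) (gam(t) + gam(r_n + 1 - t))^(k-1) + gam(0)^k
        s = sum(gam[t] * (gam[t] + gam[r_n + 1 - t]) ** (k - 1)
                for t in range(1, r_n + 1))
        return 2.0 * s + gam[0] ** k

    stat = (n * mu3 ** 2 / (6.0 * abs(f_tilde(3)))
            + n * (mu4 - 3.0 * mu2 ** 2) ** 2 / (24.0 * abs(f_tilde(4))))
    # Survival function of a chi-squared law with 2 degrees of freedom.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value
```

A constant sample makes both denominators vanish, so the sketch assumes a non-degenerate input.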

Theorem 4.10. Let $(X_t)_{t\in\mathbb{Z}}$ be an ergodic and stationary process such that $\sum_{t=0}^{\infty}|\gamma_X(t)| < \infty$. We have that

1. If $(X_t)_{t\in\mathbb{Z}}$ is a Gaussian process, then $\tilde G_Y \to_d \chi^2_2$.

2. If $(X_t - \mu_X)_{t\in\mathbb{Z}}$ can be written as and $E[X_0^4] < \infty$, then, conditionally on $h$, $\tilde G_Y$ diverges almost surely to infinity whenever $\mu_3 \neq 0$ or $\mu_4 \neq 3\mu_2^2$.

Proof.

Using Proposition 4.3 for ( = we get that $(Y_t)_{t\in\mathbb{Z}}$ is an ergodic and stationary process with $\sum_{t=0}^{\infty}|\gamma_Y(t)| < \infty$.

If $(X_t)_{t\in\mathbb{Z}}$ is Gaussian, the process $(Y_t)_{t\in\mathbb{Z}}$ is also Gaussian. Thus, the assumptions of the first part of Theorem 3.6 hold for the process $(Y_t)_{t\in\mathbb{Z}}$ and so $G_Y \to_d \chi^2_2$. Now, as $Y$ is Gaussian, by [13], page 568, we have that $F_k > 0$ for $k = 3, 4$. Repeating the proof of Lemma 1 in [21], we have that $\lim_{n\to\infty}\tilde F_k = F_k$ and so we may conclude that $\lim_{n\to\infty}\tilde G_Y = \lim_{n\to\infty} G_Y$, which shows 1.

Let us prove now statement 2. First, let us show that $E[|Y_0|^k \mid h] < \infty$, almost surely, for $k = 1,\dots,4$. By Hölder inequality we have that $|Y_0| \le \big(\sum_{i=0}^{\infty} a_i\big)^{1/2}\big(\sum_{i=0}^{\infty} h_i^2 a_i X_{-i}^2\big)^{1/2}$ and, as by Proposition 4.1 $\sum_{i=0}^{\infty} h_i^2 a_i = 1$, almost surely, we can apply Jensen inequality. We obtain that

$$Y_0^4 \le \Big(\sum_{i=0}^{\infty} a_i\Big)^2 \Big(\sum_{i=0}^{\infty} h_i^2 a_i X_{-i}^4\Big), \quad \text{almost surely}.$$

Thus, $E[|Y_0|^k \mid h] < \infty$, almost surely, for $k = 1,\dots,4$. By [10], page 458, we have that $(Y_t^k)_{t\in\mathbb{Z}}$ is stationary and ergodic, for all $k = 1,\dots,4$. Therefore, Theorem 2 of Chapter IV in [16] implies that

$$\lim_{n\to\infty}\hat\mu_k = \mu_k, \quad \text{for almost every } h \text{ and } k = 2, 3, 4. \qquad (12)$$

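Further, let us prove that $\lim_{n\to\infty}|\tilde F_k| < \infty$ for almost every $h$ and $k = 3, 4$. We have

$$\tilde F_k = \hat\gamma_Y^k + 2\sum_{t=1}^{r_n}\sum_{j=0}^{k-1}\binom{k-1}{j}\hat\gamma_Y(t)^{k-j}\,\hat\gamma_Y(r_n+1-t)^{j}.$$

Taking into account that $|a^{k-j}b^{j}| \le |a|^k + |b|^k$, with $k \in \mathbb{N}$, $j \in \mathbb{N}$ and $j \le k$, we have

$$|\tilde F_k| \le |\hat\gamma_Y|^k + 2^k\sum_{t=1}^{r_n}\big(|\hat\gamma_Y(t)|^k + |\hat\gamma_Y(r_n+1-t)|^k\big),$$

and then we obtain $|\tilde F_k| \le 2^{k+1}\big(\sum_{t=0}^{r_n}|\hat\gamma_Y(t)|\big)^k$. Let us prove now that

$$\lim_{n\to\infty}\sum_{t=0}^{r_n}|\hat\gamma_Y(t)| < \infty.$$

Note that as $E[X_0^4] < \infty$, we also have

$$\infty > E[(X_n - \mu_X)^4] = \sum_{j_1,\dots,j_4=1}^{\infty}\ \prod_{r=1}^{4} k(j_r)\, E\Big[\prod_{r=1}^{4}\varepsilon_{n-j_r}\Big] = E[\varepsilon_1^4]\sum_{j} k(j)^4 + 3\,E[\varepsilon_1^2]^2\sum_{i\neq j} k(i)^2 k(j)^2,$$

because $(\varepsilon_n)$ are i.i.d. r.v.s with $E[\varepsilon_1] = 0$. So that $E[\varepsilon_1^4] < \infty$. Further, using Theorem 3.7 we obtain that

$$\sum_{t=0}^{r_n}\big(|\hat\gamma_X(t)| - |\gamma_X(t)|\big) \le (r_n+1)\,o(n^{2/\alpha-1}) = o(1).$$

Thus, $\lim_{n\to\infty}\sum_{t=0}^{r_n}|\hat\gamma_X(t)| < \infty$. Then, by proceeding similarly as in the proof of Proposition 4.3, we get $\lim_{n\to\infty}\sum_{t=0}^{r_n}|\hat\gamma_Y(t)| < \infty$ and so $\lim_{n\to\infty}|\tilde F_k| < \infty$ for $k = 3, 4$. Using (12) we may conclude that 2. holds. □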
Finally, applying Theorem 4.10 directly to the process $X$, we obtain the following corollary.

Corollary 4.11. Let $(X_t)_{t\in\mathbb{Z}}$ be an ergodic and stationary process such that $\sum_{t=0}^{\infty}|\gamma_X(t)| < \infty$. We have that

1. If $(X_t)_{t\in\mathbb{Z}}$ is a Gaussian process, then $\tilde G_X \to_d \chi^2_2$.

2. If $(X_t - \mu_X)_{t\in\mathbb{Z}}$ can be written as ^ and $E[X_0^4] < \infty$, then, conditionally on $h$, $\tilde G_X$ diverges almost surely to infinity whenever $\mu_{X,3} \neq 0$ or $\mu_{X,4} \neq 3\mu_{X,2}^2$.

5 The tests in practice 

In this section we discuss the practical implementation of the Gaussianity test. We start with some remarks on Epps test.

5.1 Remark on Epps test 



Although Theorem 3.5 works for any $\lambda \in \Lambda_N$, with $N > 1$, satisfying H1 and H2, in [11] it is stated that:

• When either $N$ is large or the spacing between the $\lambda_j$ is small, relative to the scale of the data, the matrix $2\pi\hat f(0,\lambda)$ often appears computationally singular.

• Also, values of $\lambda_j$ which are large, relative to the scale of the data, make it difficult to find a minimum of $Q_n(\cdot,\cdot,\lambda)$ with much precision.

Epps suggests taking

$$\lambda_j = \xi_j/\sqrt{\hat\gamma}, \quad \text{with } \xi_j > 0,\ j = 1,\dots,N. \qquad (13)$$

Recall that $\hat\gamma$ denotes the sample variance of the process. He proved that Theorem 3.5 works taking such $\lambda$. In the simulations of Epps, and also in the ones of [21], $N = 2$ and $(\xi_1,\xi_2) = (1,2)$.

However, we need to draw $\lambda$ randomly in order to have a consistent test (Theorem 4.7). So, we take $N = 2$, $\xi_1$ distributed as the absolute value of a standard normal distribution and $\xi_2$ distributed as the absolute value of a normal distribution with mean zero and variance 4. With this selection, although seldom, we have found that $\hat f(0,\lambda)$ could be singular. This is the main reason to choose $G_n^+(\lambda)$ as the generalized inverse of $2\pi\hat f(0,\lambda)$.

Another important practical issue is the procedure used to find the minimizer nearest to $(\hat\mu,\hat\gamma)$ of the map $(\nu,\rho) \to Q_n(\nu,\rho,\lambda)$. In the simulations of [11] and [21] they use the simplex method developed in [25]. We did the same. The code of this method can be found in [27] under the name amoeba.
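The random choice of $\lambda$ and the two vectors compared by the quadratic form can be sketched as follows. This is only a hedged illustration: the function names are hypothetical, and the sketch stops short of $Q_n$ itself, which additionally needs the generalized inverse of $2\pi\hat f(0,\lambda)$, an estimator defined outside this section. The vector for a normal law uses its characteristic function $\Phi_{\mu,\gamma}(\lambda) = \exp(i\lambda\mu - \lambda^2\gamma/2)$, with $\gamma$ the variance.

```python
import math
import random


def draw_lambda(sample_var, rng):
    """Draw (lambda_1, lambda_2) as in (13) with random xi_1, xi_2."""
    xi1 = abs(rng.gauss(0.0, 1.0))   # |N(0, 1)|
    xi2 = abs(rng.gauss(0.0, 2.0))   # |N(0, 4)|, i.e. standard deviation 2
    scale = math.sqrt(sample_var)
    return (xi1 / scale, xi2 / scale)


def empirical_g(x, lam):
    """Sample means of cos(lambda_j x_t) and sin(lambda_j x_t), j = 1..N."""
    n = len(x)
    out = []
    for l in lam:
        out.append(sum(math.cos(l * v) for v in x) / n)
        out.append(sum(math.sin(l * v) for v in x) / n)
    return out


def gaussian_g(mu, gamma, lam):
    """Real and imaginary parts of the N(mu, gamma) characteristic function."""
    out = []
    for l in lam:
        damp = math.exp(-0.5 * l * l * gamma)
        out.append(damp * math.cos(l * mu))
        out.append(damp * math.sin(l * mu))
    return out
```

The difference `empirical_g(x, lam)` minus `gaussian_g(mu, gamma, lam)` is the vector whose weighted squared norm a minimizer such as the simplex method would then drive down over $(\mu,\gamma)$.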
5.2 The random projection procedure to test Gaussianity

The theoretical development of Section 4 was carried out assuming that the observed sample is infinite. However, in practice, only a finite number of measurements $X_0,\dots,X_n$ are available, so that only a finite number of components of $h$ are computed. This last difficulty is handled by fixing a small $\delta > 0$ (equal to $10^{-15}$ in the simulations that we present in Section 6) and by taking $h = (h_0,\dots,h_m)^T$ with

$$m = \min\{\min\{t : \|(h_0,\dots,h_t)^T\| > 1-\delta\},\ n-1\},$$

where $h_0,\dots,h_{m-1}$ are drawn by the stick-breaking procedure described in Section 4. Further, $h_m$ is fixed such that $\|h\| = 1$. Concerning the projected process, some possibilities are available, but here we use

$$Y_t = \sum_{i=0}^{\min(m,t)} h_i X_{t-i} a_i, \quad t = 0,\dots,n.$$

Let us give a short comment on the choice of the parameters $\alpha_1, \alpha_2 > 0$ of the beta distribution used to generate $h$. Here we have to deal with the following situation: if $m$ is large, then the random variables $Y_t$ are linear combinations of many random variables from the original sample and then, because of the Central Limit Theorem, the distribution of the random variables $Y_t$ will become closer to a normal law. That will cause some loss of power when the marginal of $X$ is not Gaussian. Thus, in order to detect a non-Gaussian marginal, it is wise to select $\alpha_1$ and $\alpha_2$ in such a way that $m$ is small or even 0 or 1. This goal is achieved if we take $\alpha_2 = 1$ and $\alpha_1$ large. Our selection in Section 6 is $\alpha_1 = 100$.

However, in this case the samples $Y_0,\dots,Y_n$ and $X_0,\dots,X_n$ are quite similar, so the test will not be good at detecting non-Gaussian alternatives with Gaussian marginal. In order to overcome this problem we should take $h$ in such a way that the projections mix several variables from the initial sample. To achieve this goal we need to take $\alpha_2 > \alpha_1$, but with $\alpha_2$ not too big, to avoid the effect of the Central Limit Theorem. In this case, a selection like $\alpha_1 = 2$ and $\alpha_2 = 7$ seems appropriate. Therefore, it seems that in a practical situation we should decide which alternative is more plausible and then select the appropriate parameters. However, there is another possibility: select two projections (one with each pair of parameters) and apply Theorem 3.8 to mix the p-values. This is our proposal.
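The truncation rule for $h$ and the projection formula can be illustrated with the following sketch. Several choices here are assumptions, since the stick-breaking procedure and the weights $a_i$ are defined in Section 4 and not restated here: the squared components are taken as the pieces of the stick (each a $\beta(\alpha_1,\alpha_2)$ fraction of what remains), the norm is Euclidean, and all weights $a_i$ default to 1. The function names `draw_h` and `project` are hypothetical.

```python
import math
import random


def draw_h(alpha1, alpha2, n, rng, delta=1e-15):
    """Draw h = (h_0, ..., h_m) by stick-breaking, truncated as in the text.

    Assumption: h_t^2 = B_t * (1 - h_0^2 - ... - h_{t-1}^2), B_t ~ beta(alpha1, alpha2).
    """
    h = []
    remaining = 1.0  # 1 - (h_0^2 + ... + h_{t-1}^2)
    for _ in range(n - 1):
        piece = rng.betavariate(alpha1, alpha2) * remaining
        # Stop at the first t whose partial norm would exceed 1 - delta.
        if math.sqrt(1.0 - (remaining - piece)) > 1.0 - delta:
            break
        h.append(math.sqrt(piece))
        remaining -= piece
    # The last component h_m is fixed so that the norm of h equals 1.
    h.append(math.sqrt(remaining))
    return h


def project(x, h, a=None):
    """Y_t = sum_{i=0}^{min(m,t)} h_i * X_{t-i} * a_i for t = 0, ..., len(x) - 1."""
    m = len(h) - 1
    if a is None:
        a = [1.0] * (m + 1)  # assumption: unit weights
    return [sum(h[i] * a[i] * x[t - i] for i in range(min(m, t) + 1))
            for t in range(len(x))]
```

For instance, `project(x, draw_h(100.0, 1.0, len(x), random.Random(1)))` would give one projected sample dominated by the first component, while the pair `(2.0, 7.0)` spreads the weight over more lags.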

Finally, we need a Gaussianity test for the one-dimensional marginal of $(Y_0,\dots,Y_n)$. We have seen two such tests (which have some advantages and disadvantages discussed in Section 6) and we can also mix them. Having all these requirements in mind, we propose the following procedure:

1. Draw $h^{(1)}$ with the $\beta(100,1)$ distribution and apply Epps test to the projections to obtain the p-value $p^{(1)}$.

2. Draw $h^{(2)}$ (independently of $h^{(1)}$) with the $\beta(100,1)$ distribution and apply Lobato and Velasco test to the projections to obtain the p-value $p^{(2)}$.

3. Draw $h^{(3)}$ (independently of $h^{(1)}$ and $h^{(2)}$) with the $\beta(2,7)$ distribution and apply Epps test to the projections to obtain the p-value $p^{(3)}$.

4. Draw $h^{(4)}$ (independently of $h^{(1)}$, $h^{(2)}$ and $h^{(3)}$) with the $\beta(2,7)$ distribution and apply Lobato and Velasco test to the projections to obtain the p-value $p^{(4)}$.

5. Combine the p-values $p^{(1)},\dots,p^{(4)}$ using the procedure described in Section 3.4 to decide the Gaussianity hypothesis at the level $\alpha$. Thus, ordering these four p-values such that $p_{(1)} \le \dots \le p_{(4)}$, we obtain that the p-value of the random projection test is equal to $(25/3)\cdot\min_{i=1,\dots,4} p_{(i)}/i$.
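The combination in step 5 can be sketched as below. The constant $25/3$ equals $4\,(1 + 1/2 + 1/3 + 1/4)$, so the sketch assumes that the general rule for $k$ p-values is $k\,\big(\sum_{i=1}^{k} 1/i\big)\min_i p_{(i)}/i$; this generalization, the cap at 1 and the function name `combine_pvalues` are assumptions of the sketch, not statements taken from Section 3.4.

```python
def combine_pvalues(pvals):
    """Combine k p-values as k * (1 + 1/2 + ... + 1/k) * min_i p_(i) / i."""
    k = len(pvals)
    ordered = sorted(pvals)
    harmonic = sum(1.0 / i for i in range(1, k + 1))
    smallest_ratio = min(p / i for i, p in enumerate(ordered, start=1))
    # Cap at 1 so that the result is still a valid p-value (an assumption).
    return min(1.0, k * harmonic * smallest_ratio)
```

With four p-values the factor is exactly $25/3$, so for example the inputs 0.04, 0.01, 0.03, 0.02 give $(25/3)\cdot 0.01$.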

6 Simulations 

In this section we study the behavior of the proposed procedure in different situations. We have used the same distributions as in [21], in order to perform comparisons. Further, we study a situation where the process has Gaussian marginal but is not Gaussian (see Subsection 6.1). In addition, in Subsection 6.2 we apply the random projection test to real data.

The authors of [21] study the case of an AR(1) process depending on a parameter $q$ defined by

$$X_t = qX_{t-1} + \varepsilon_t, \qquad (14)$$

where $q \in \{-.9, -.5, 0, .5, .6, .7, .8, .9\}$, $t \in \mathbb{Z}$ and $\varepsilon_t$ are i.i.d. random variables with distribution $D_\varepsilon$, which may be any of the following ones:

• standard normal ($N(0,1)$),

• standard log-normal ($\log N$),

• Student t with 10 degrees of freedom ($t_{10}$),

• chi-squared with 1 ($\chi^2_1$) and 10 degrees of freedom ($\chi^2_{10}$),

• uniform on [0,1] ($U(0,1)$),

• beta with parameters (2, 1) ($\beta(2,1)$).

To simulate the process, we generate a large number of independent realizations $\varepsilon_t$, $t = 1,\dots,M$, with distribution $D_\varepsilon$ and we take

• $X_1 = \varepsilon_1$

• $X_t = qX_{t-1} + \varepsilon_t$, $t = 2,\dots,M$

It is obvious that if $q \neq 0$, this process is not stationary. For instance, $\mathrm{Var}[X_t] = \mathrm{Var}[\varepsilon_1](1 - q^{2t})/(1 - q^2)$, which is not constant and, obviously, the differences increase with $|q|$. In order to alleviate this problem, we discarded the first past observations. We have taken past equal to 1000 and $n = M - \text{past}$ equal to 100, 500, 1000, which are the sample sizes handled in [21].
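The simulation scheme just described, including the discarded initial stretch, can be sketched as follows. The function name `simulate_ar1` and the way innovations are passed (a callable taking a random generator) are choices of this sketch, not of the paper.

```python
import random


def simulate_ar1(q, n, innovations, past=1000, rng=None):
    """Simulate X_1 = e_1, X_t = q X_{t-1} + e_t and drop the first `past` values."""
    rng = rng or random.Random()
    x_prev = innovations(rng)          # X_1 = e_1
    path = [x_prev]
    for _ in range(n + past - 1):      # X_2, ..., X_M with M = n + past
        x_prev = q * x_prev + innovations(rng)
        path.append(x_prev)
    return path[past:]


def variance_ratio(q, t):
    """Var[X_t] / Var[e_1] = (1 - q^(2t)) / (1 - q^2) for |q| != 1."""
    return (1.0 - q ** (2 * t)) / (1.0 - q ** 2)


# Hypothetical innovation samplers built from the standard library.
INNOVATIONS = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "lognormal": lambda rng: rng.lognormvariate(0.0, 1.0),
    "chi2_1": lambda rng: rng.gammavariate(0.5, 2.0),
    "chi2_10": lambda rng: rng.gammavariate(5.0, 2.0),
    "uniform": lambda rng: rng.random(),
    "beta_2_1": lambda rng: rng.betavariate(2.0, 1.0),
}
```

For $q = .7$ the ratio `variance_ratio(0.7, t)` is 1 at $t = 1$ and approaches $1/(1 - .49) \approx 1.96$ as $t$ grows, which is why the first observations are discarded.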

We have performed 5000 simulations in each situation. In every run we have computed the p-values using the asymptotic distributions. This could explain why the rejection rates under the null hypothesis are sometimes far from the nominal level (mostly for the lowest sample size $n = 100$) and why, under some alternatives, they decrease with the sample size (mostly for high values of $|q|$).

There are some differences between our rates and those published in [21]. We think they could be due to the fact that the past taken in [21] is not large enough. For example, in the case $n = 100$, $q = .7$ and $D_\varepsilon$ being $\beta(2,1)$ we obtain a rejection rate of .2214 when using Epps test, while in [21] they obtain one of .080, which is appreciably lower. As explained before, our simulations were made with past = 1000, but from Table 6.1 we see that .080 is a reasonable rejection rate for past = 0 and that the rejection rates increase with past, approaching the value we have obtained.

  past          0       1       2       10
  rejections    .0750   .1378   .1998   .2210

Table 6.1. Rejection rates along 5,000 simulations for different past, with Epps test, $n = 100$, $D_\varepsilon$ a $\beta(2,1)$ and $q = .7$.

We have observed the same problem with Lobato and Velasco test, except that with this test our rejection rates are lower than those reported in [21].

Furthermore, another difference worth highlighting between what we do here and [21] is that in Subsection 4.2 a sum until $r_n$ is involved in the estimation of $F_k$, while in [21] the sum goes until $n - 1$, where $n$ is the sample size. Here, we have to take $r_n < c\,n^{\beta}$, where $\beta = 1 - 2/\alpha$ with $2 < \alpha < 4$ and $c > 0$. Thus, $\beta$ may be as close as desired to .5 and so we have decided to fix its value at $\beta_0 = .5$ for the simulations. In order to select the right value of $c$, we have made a small analysis to see how sensitive the method is to this parameter. We ran Lobato and Velasco test under the null hypothesis for all values of $q$ and $c = 1, 2, \dots, c_n$, where $c_n = [\sqrt{n}]$ and $n = 100, 500, 1000$. Therefore, $c_{100} = 10$, $c_{500} = 22$ and $c_{1000} = 31$. The results suggest that the value of $c$ does not have much influence on the rejection rates and so we choose $c = 1$. The results for the cases $q = 0$ and $q = .5$ appear in Figure 1. It is worth saying that the situation for $q = -.9$ is a bit different than for the other values, as with $q = -.9$ the rejection rates look constant up to a point after which those rates strongly decrease.

Tables 6.2, 6.3 and 6.4 contain the rejection rates for several procedures when applied at the level .05. Next we mention the procedures we have selected and make some comments on the results of our simulations.



1. Epps test, E-test. We take $(\xi_1,\xi_2) = (1,2)$ in (13).

It seems that this test behaves poorly when $D_\varepsilon$ is $t_{10}$. Moreover, broadly speaking, its power decreases for the considered alternative distributions when $|q|$ increases, having low powers when $|q| = .9$. Note also that under the null hypothesis (excepting the case $q = 0$ with $n = 1000$), the rejection rates are above the level of the test and that they increase with $|q|$.

The power decreases when the sample size increases in the cases in which $|q| = .9$ and the alternative is $t_{10}$, $\chi^2_{10}$, $U(0,1)$ or $\beta(2,1)$ (and even with $q = .8$ when $D_\varepsilon = t_{10}$).

Figure 1: Rejection rates under the null hypothesis for an AR(1) process with $q = 0$ (upper graph) and $q = .5$ (lower graph), using Lobato and Velasco test, for different values of $c$ and sample sizes ($n = 100, 500, 1000$).

2. Lobato and Velasco test, G-test. The rejection rates displayed have been simulated using the statistic $\tilde G_X$. However, they are similar to those obtained using $G_X$.

The G-test has very low powers when $|q|$ is large, sometimes even lower than those of the E-test. In addition, it suffers from a lack of power when $D_\varepsilon$ is $U(0,1)$ or $\beta(2,1)$. The rejections under the null hypothesis are above the level of the test only in 4 cases out of 24. In contrast with the E-test, here the rejection rates under the null hypothesis decrease when $q$ increases.

3. Combined Epps and Lobato and Velasco test, GE-test. In previous paragraphs we have commented on some problems of the E and G tests which go, let us say, in opposite directions. In order to solve these problems we combine both tests, using the multiple testing procedure presented in Section 3.4.

As stated in Subsection 5.1, the GE-test has been obtained by drawing independently $\xi_1$ as the absolute value of a standard normal distribution and $\xi_2$ as the absolute value of a normal distribution with mean zero and standard deviation 2. However, it is worth noting that the rejection rates we have obtained have been a bit larger than in the case we take $(\xi_1,\xi_2) = (1,2)$.

We can observe from Tables 6.2, 6.3 and 6.4 that this combination gives rejection rates between those of the E and G-tests, although closer to the highest one and, sometimes, even above it. This is due to the fact that, as we have previously said, the rejection rates of E are here a bit larger than when $(\xi_1,\xi_2) = (1,2)$.

4. Random projection test, RP-test. We apply this test following the guidelines provided in Subsection 5.2.

When $q$ is negative and we are under the alternative, we always get the highest rejection rates with the RP-test. The most striking behavior of this test happens for $q = .9$ and $D_\varepsilon = \chi^2_{10}$ and $\beta(2,1)$, where the RP-test obtains rejection rates larger than 0.8 while the second most successful test remains below 0.25. For the remaining values, the rejection rates using the RP-test are between the rates obtained with the E, G and GE tests, but closer to the highest than to the lowest.
  q     Test   N(0,1)   log N    t_10     χ²_1     χ²_10    U(0,1)   β(2,1)

 -.9    E      .1264    .0508    .1104    .0656    .1124    .1390    .1354
        G      .0292    .1414    .0310    .0840    .0332    .0290    .0266
        GE     .0942    .14//    .0908    .1072    .0920    .1020    .1010
        RP     .1380    .8070    .1742    .7576    .3076    .2620    .3902

 -.5    E      .0724    .6780    .0556    .8514    .2058    .5408    .4914
        G      .0504    .9986    .1692    .9986    .4602    .0102    .1696
        GE     .0774    .9976    .1582    .9972    .4552    .4454    .4154
        RP     .0752    .9998    .1980    1        .5824    .6404    .7460

  0     E      .0632    .9616    .0830    .9964    .5372    .9918    .9704
        G      .0458    1        .2820    1        .7898    .5404    .7520
        GE     .0732    1        .2402    1        .8074    .8596    .8706
        RP     .0772    1        .2288    1        .7640    .8496    .9054

  .5    E      .0682    .8594    .0608    .9582    .2610    .5618    .5562
        G      .0384    .9990    .1696    .9982    .4118    .0010    .1102
        GE     .0642    .9990    .1444    .9988    .4700    .4680    .4882
        RP     .0750    .9908    .1132    .9880    .5226    .3256    .7500

  .6    E      .0710    .6118    .0582    .8106    .2006    .3462    .3650
        G      .0358    .9884    .1162    .9772    .2858    .0012    .0592
        GE     .0640    .9882    .1144    .9832    .3218    .2800    .3086
        RP     .0802    .9536    .1030    .9262    .5164    .2580    .7744

  .7    E      .0838    .3250    .0626    .4640    .1492    .2032    .2214
        G      .0260    .9076    .0814    .8196    .1610    .0036    .0334
        GE     .0714    .9042    .0866    .8448    .1998    .1634    .1802
        RP     .0784    .8022    .0926    .7010    .5754    .2902    .8060

  .8    E      .1034    .1552    .0810    .2004    .1324    .1620    .1596
        G      .0206    .6146    .0466    .4406    .0708    .0046    .0166
        GE     .0726    .6118    .0796    .4488    .1122    .1154    .1136
        RP     .0896    .4928    .0932    .3264    .6766    .3950    .8782

  .9    E      .1752    .1264    .1618    .1368    .1612    .1870    .1680
        G      .0106    .1558    .0094    .0714    .0150    .0054    .0086
        GE     .1074    .1844    .0968    .1190    .0980    .1182    .1072
        RP     .1168    .1982    .1174    .1338    .8702    .6788    .9662

Table 6.2. Rejection rates at level .05 of a process defined by (14). Sample size n = 100
  q     Test   N(0,1)   log N    t_10     χ²_1     χ²_10    U(0,1)   β(2,1)

 -.9    E      .0744    .3720    .0584    .2162    .0712    .0918    .0850
        G      .0708    .8838    .0840    .6202    .1142    .0462    .0754
        GE     .0780    .8604    .0924    .5400    .1116    .0866    .0952
        RP     .0810    .9990    .2260    .9928    .6924    .4630    .6918

 -.5    E      .0594    1        .1334    1        .7730    .9924    .9922
        G      .0472    1        .4580    1        .9960    .9656    .9976
        GE     .0470    1        .3784    1        .9912    .9514    .9914
        RP     .0490    1        .5090    1        .9998    .9946    1

  0     E      .0566    1        .3292    1        .9982    1        1
        G      .0480    1        .7428    1        1        1        1
        GE     .0510    1        .6756    1        1        1        1
        RP     .0554    1        .6188    1        1        1        1

  .5    E      .0654    1        .1476    1        .8808    .9918    .9960
        G      .0454    1        .4340    1        .9972    .9704    .9988
        GE     .0516    1        .3816    1        .9924    .9504    .9962
        RP     .0618    1        .2656    1        .9610    .7440    .9634

  .6    E      .0566    .9998    .1026    1        .7084    .8286    .9090
        G      .0470    1        .3336    1        .9582    .4678    .8858
        GE     .0570    1        .2692    1        .9388    .6944    .8870
        RP     .0610    1        .1794    1        .8604    .4730    .9006

  .7    E      .0708    .9996    .0786    1        .4704    .4042    .5810
        G      .0474    1        .1970    1        .7592    .0644    .4040
        GE     .0598    1        .1670    1        .7332    .3640    .5768
        RP     .0702    1        .1282    1        .6986    .2616    .8786

  .8    E      .0776    .9780    .0710    .9638    .2500    .1948    .2564
        G      .0744    .9998    .0976    .9980    .3908    .1524    .2628
        GE     .0702    .9998    .1102    .9978    .3972    .1848    .2960
        RP     .0710    .9986    .0910    .9908    .6834    .2484    .9208

  .9    E      .1156    .5708    .0944    .4674    .1526    .1430    .1560
        G      .0232    .8356    .0370    .5404    .0764    .0138    .0336
        GE     .0802    .8708    .0838    .6378    .1490    .1092    .1390
        RP     .0860    .7996    .0770    .5510    .8430    .4818    .9772

Table 6.3. Rejection rates at level .05 of a process defined by (14). Sample size n = 500
  q     Test   N(0,1)   log N    t_10     χ²_1     χ²_10    U(0,1)   β(2,1)

 -.9    E      .0648    .7836    .0578    .4572    .0826    .0888    .0942
        G      .0902    .9934    .1206    .8932    .2448    .0760    .1358
        GE     .0880    .9856    .1002    .8560    .2190    .1004    .1450
        RP     .0940    1        .3344    .9998    .8686    .5876    .8056

 -.5    E      .0530    1        .2574    1        .9764    1        1
        G      .0436    1        .6778    1        1        1        1
        GE     .0450    1        .6040    1        1        1        1
        RP     .0378    1        .7498    1        1        1        1

  0     E      .0490    1        .5946    1        1        1        1
        G      .0546    1        .9364    1        1        1        1
        GE     .0486    1        .9162    1        1        1        1
        RP     .0422    1        .8734    1        1        1        1

  .5    E      .0550    1        .2534    1        .9966    1        1
        G      .0482    1        .6788    1        1        1        1
        GE     .04/4    1        .6016    1        1        1        1
        RP     .0484    1        .4348    1        .9994    .9738    .9996

  .6    E      .0566    1        .1718    1        .9580    .9800    .9974
        G      .0472    1        .5112    1        .9996    .9724    .9996
        GE     .0464    1        AZ64:    1        .9996    .9550    .9986
        RP     .0584    1        .2812    1        .9902    .7110    .9804

  .7    E      .0594    1        .1162    1        .7720    .6338    .8632
        G      .0418    1        .3104    1        .9744    .3642    .8830
        GE     .0558    1        .2380    1        .9672    .5642    O 70 A
        RP     .0598    1        .1754    1        .8888    .3554    .9036

  .8    E      .0690    .9998    .0720    1        .4342    .2288    .4108
        G      .0500    1        .1638    1        .6804    .0432    .3284
        GE     .0670    1        .1294    1        .6708    .2216    .4450
        RP     .0654    1        .0996    1        .7144    .1920    .9076

  .9    E      .0902    .9152    .0880    .7690    .1836    .1170    .1686
        G      .0346    .9944    .0636    .9136    .1574    .0174    .0574
        GE     .0690    .9926    .0798    .9206    .2178    .1040    .1596
        RP     .0736    .9844    .0678    .8580    .8328    .3946    .9774

Table 6.4. Rejection rates at level .05 of a process defined by (14). Sample size n = 1000



6.1 A strictly stationary non-Gaussian process with Gaussian marginal 

In this subsection we discuss the behavior of the proposed procedure when used on a non-Gaussian process with Gaussian marginal. We have worked with the process introduced in Example 2.3 in [9]. Its construction is explained here for the sake of completeness.

Let $p$ be a prime number, and let $Y$, $U$ and $\{Z_{m\cdot p}, m = 0, 1, \dots\}$ be mutually independent random variables, all uniformly distributed on $\{0, 1, \dots, p-1\}$. Set

$$Z_{m\cdot p+k} = Z_{m\cdot p} \oplus (kY), \quad k = 0,\dots,p-1,\ m = 0, 1, 2, \dots,$$

where $\oplus$ stands for sum modulus $p$. According to [9] the sequence $W_n = Z_{n+U}$ is composed of pairwise independent random variables and it is stationary. Moreover, these random variables are not mutually independent because, by construction, for every $m \in \mathbb{N}$ we have that

$$Z_{m\cdot p} + Z_{m\cdot p+1} + \dots + Z_{m\cdot p+p-1} = p(p-1)/2,$$

and so,

$$W_{mp-U} + W_{mp-U+1} + \dots + W_{mp-U+p-1} = p(p-1)/2. \qquad (15)$$

Therefore, the knowledge of the random variables $W_{n-U}, W_{n-U+1}, \dots, W_{n-U+p-2}$ completely determines the value of $W_{n-U+p-1}$.

Now, given $k \in \{0,\dots,p-1\}$, let $q_k$ be the quantile of order $k/p$ of the standard Gaussian distribution. For every $n \in \mathbb{N}$, let us define the random variable $W_n^*$ conditionally to $W_n$ as follows: if $W_n = k$, then draw the value of $W_n^*$ with a standard Gaussian distribution conditioned to be in the interval $(q_k, q_{k+1})$, and independent of all the other random variables.

Since $W_n$ is uniformly distributed on $\{0, 1, \dots, p-1\}$, we obviously have that $W_n^*$ is a standard Gaussian r.v. Moreover, the sequence $(W_n^*)$ inherits the remaining properties of $(W_n)$. It is a strictly stationary sequence of pairwise independent Gaussian random variables.

However, if $n > p-1$ and we are aware of the values $W^*_{n-U}, \dots, W^*_{n-U+p-2}$, we can recover the values $W_{n-U}, \dots, W_{n-U+p-2}$ and, because of (15), we may deduce the value of $W_{n-U+p-1}$. With this information, we know to which interval $W^*_{n-U+p-1}$ belongs. Therefore, the random variables $(W_n^*)_n$ are not mutually independent and so the process is not Gaussian.
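The construction of $W^*$ can be sketched as follows. The function name `simulate_w_star` is hypothetical, and the conditional Gaussian draw is done by inverse-CDF sampling (a uniform draw on $(k/p, (k+1)/p)$ mapped through the standard normal quantile function), which is one possible way to sample from the conditioned law and not necessarily the one used in the simulations. Note that within a block the values $kY \bmod p$ run over all residues only when $Y \neq 0$, so in this sketch the block-sum identity is only checked for draws with $Y \neq 0$.

```python
import random
from statistics import NormalDist


def simulate_w_star(p, n, rng):
    """Return (w, w_star, y, u) for the first n terms of W and W*."""
    y = rng.randrange(p)               # Y uniform on {0, ..., p-1}
    u = rng.randrange(p)               # U uniform on {0, ..., p-1}
    blocks = (n + u) // p + 1          # enough blocks to cover indices up to n - 1 + u
    z = []
    for _ in range(blocks):
        start = rng.randrange(p)       # Z_{m p}
        z.extend((start + k * y) % p for k in range(p))   # Z_{m p + k}
    w = [z[i + u] for i in range(n)]   # W_n = Z_{n + U}

    std_normal = NormalDist()
    w_star = []
    for k in w:
        # Uniform on (k/p, (k+1)/p), clamped away from 0 and 1 for inv_cdf.
        v = (k + rng.random()) / p
        v = min(max(v, 1e-12), 1.0 - 1e-12)
        w_star.append(std_normal.inv_cdf(v))
    return w, w_star, y, u
```

Each value of `w_star` falls in the interval $(q_k, q_{k+1})$ selected by the corresponding entry of `w`, so `w` can be recovered from `w_star`, which is the dependence exploited in the argument above.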

We have simulated the previous process 5000 times for different values of $p$ and sample sizes $n = 100, 500, 1000$. Then, we have applied the RP test at the level $\alpha = .05$. The rejection rates appear in Table 6.5.

             p = 2    p = 3    p = 5    p = 7    p = 11   p = 13   p = 17
  n = 100    .1448    .1268    .1676    .1516    .1602    .1380    .1146
  n = 500    .3698    .3654    .4938    .5154    .5822    .5590    .5588
  n = 1000   .6382    .6386    .6814    .7250    .7802    .7608    .7700

Table 6.5. Rejection rates for different sample sizes applying the RP test to the W* process at the level α = .05.



For comparison, we show in Table 6.6 the rates of rejection when using the E, G and GE tests in the case $p = 5$. Since these tests check for the non-Gaussianity of the marginal, the rejection rates are not high. However, it is worth paying some attention to the rejection rates in this table. To begin with, they are below the intended level (except GE with $n = 100$) but, more surprisingly, they show some decrease when the sample size increases. We think that this is due to the fact that these tests see the process $W^*$ as more Gaussian than a Gaussian process.

The reason is that when we generate observations of a Gaussian process, approximately a proportion of $1/p$ observations are in the interval $(q_k, q_{k+1})$, with $k \in \{0,\dots,p-1\}$. However, the process $W^*$ generates exactly a proportion of $1/p$ observations in each interval $(q_k, q_{k+1})$, so it has a "more Gaussian" behavior than expected. Consequently, the rejection rates are lower than .05 and this fact becomes more apparent when $n$ increases.

         n = 100   n = 500   n = 1000
  E      .0338     .0266     .0186
  G      .0372     .0336     .0326
  GE     .0520     .0336     .0206

Table 6.6. Rejection rates using the E, G and GE tests of the W* process with p = 5, at the level α = .05.



6.2 Real data 

In this subsection we work with the well-known Canadian lynx and Wolfer sunspot data in order to illustrate the behavior of the random projection test. The Canadian lynx data consists of the annual record of the number of lynxes trapped in the Mackenzie River district of North-West Canada for the period from 1821 to 1934, while the Wolfer sunspot data consists of the annual record of the sunspot activity in the period from 1700 to 1960. These data were used in [11] and previously in [30], obtaining in both cases that the processes are not Gaussian.

We apply the random projection procedure to the lynx and sunspot data following the indications in Subsection 5.2. The obtained p-values are displayed in Table 6.7 together with those obtained in [11] and in [30].

             RP               Epps             S.R. & G
  lynx       1.029 x 10^-4    1.402 x 10^-5    1.084 x 10^-4
  sunspot    1.314 x l(T e    7.356 x 10"°     2.818 x 10^-4

Table 6.7. p-values using the RP-test and the tests proposed in [11] and in [30] for the lynx and sunspot data.

In these examples we obtain p-values having approximately the same magnitudes as those of [11] and [30].



7 Discussion 

In this paper we have introduced the random projection test, RP-test, to check the 
Gaussianity of stationary processes. Given a sample, this test is based on a three-step 
procedure. First, it is required to draw a vector h in a suitable Hilbert space. Then, 
the sample is projected on the one-dimensional space spanned by h. Finally, we take 
advantage of the fact that, with probability one, the initial process is Gaussian if the 
marginal of the projected one is Gaussian. Therefore, we only need to use a test to 
check the Gaussianity of the marginal of a stationary process. In the final step we use a 
combination of the Epps and Lobato and Velasco tests. 
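
The three steps can be sketched in code. This is a minimal illustration under stated assumptions and not the procedure of the paper: the stick-breaking scheme in draw_direction is an assumed way of drawing a unit-norm h from beta variables, the projection is taken to be the linear filter y_t = sum_i h_i x_{t-i}, and the Jarque-Bera statistic is a simple stand-in for the Epps and Lobato and Velasco tests (it is calibrated for independent data only, so it is not a valid replacement for them). All function names are hypothetical.

```python
import math
import random

def draw_direction(length, a, b, rng):
    """Step 1 (assumed scheme): stick-breaking draw of a unit-norm vector h."""
    remaining, squares = 1.0, []
    for _ in range(length - 1):
        piece = rng.betavariate(a, b) * remaining
        squares.append(piece)
        remaining -= piece
    squares.append(max(remaining, 0.0))  # leftover mass, so the squares sum to 1
    return [math.sqrt(s) for s in squares]

def project(x, h):
    """Step 2: y_t = sum_i h_i * x_{t-i}, for every t where all terms exist."""
    m = len(h)
    return [sum(h[i] * x[t - i] for i in range(m)) for t in range(m - 1, len(x))]

def jarque_bera_pvalue(y):
    """Step 3 (stand-in): Jarque-Bera p-value, valid for independent data only."""
    n = len(y)
    mean = sum(y) / n
    m2 = sum((v - mean) ** 2 for v in y) / n
    m3 = sum((v - mean) ** 3 for v in y) / n
    m4 = sum((v - mean) ** 4 for v in y) / n
    skew, kurt = m3 / m2 ** 1.5, m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return math.exp(-jb / 2.0)  # survival function of a chi-square with 2 d.f.

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]  # toy Gaussian sample path
h = draw_direction(20, 2.0, 7.0, rng)          # beta(2, 7) pieces, 20 coordinates
y = project(x, h)
p_value = jarque_bera_pvalue(y)
```

In practice the last function would be replaced by a Gaussianity test for the marginal that accounts for the dependence of the projected process.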

The RP-procedure compares acceptably with the Epps and Lobato and Velasco tests (as 
well as with their combination) in situations in which the marginal is not Gaussian, 
and there are cases in which the proposed test is clearly better. Moreover, the 
RP test is able to detect alternatives with Gaussian marginal, while the other tests are 
not designed for this task. 



Although the rejection rates shown in Table 6.5 are above the nominal level, they are 
not very high, especially when the sample size is 100. A simple way to improve these rates is 
to increase the number of random projections using the correction described in Section 3.4. 



From Table 7.1 it can be seen how an increase in the number of employed random 
projections noticeably improves the rates. In this table half of the projections are taken 
using the β(100, 1) distribution and the other half with the β(2, 7), and in each case we 
compute half of the p-values with the E test and the other half with the G test. 
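
The balanced design just described can be written down as a small allocation routine. The sketch below is only illustrative: the function names are hypothetical, the p-values in the usage lines are made-up numbers, and the Bonferroni rule is a plainly named stand-in for the correction described in Section 3.4, which is not reproduced here.

```python
def allocate(n_projections):
    """Split the projections evenly over the (beta parameters, test) pairs."""
    if n_projections % 4 != 0:
        raise ValueError("n_projections must be a multiple of 4")
    quarter = n_projections // 4
    plan = []
    for params in [(100, 1), (2, 7)]:  # beta parameters named in the text
        for test in ["E", "G"]:        # marginal tests named in the text
            plan.extend([(params, test)] * quarter)
    return plan

def bonferroni(p_values):
    """Stand-in combination: reject overall if the result is below the level."""
    return min(1.0, len(p_values) * min(p_values))

plan = allocate(8)  # eight projections, chosen only as an example
# Hypothetical p-values, one per projection, for illustration only.
combined = bonferroni([0.20, 0.04, 0.60, 0.01, 0.35, 0.50, 0.08, 0.90])
```

Each of the four (distribution, test) pairs receives the same number of projections, and the combined value is compared with the nominal level to decide on rejection.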





             k = 2     k = 3     k = 5     k = 8
n = 100      .1448     .1906     .2288     .2674
n = 500      .3654     .5772     .6988     .8064
n = 1000     .6814     .7688     .8498     .8628

Table 7.1. Rejection rates for different sample sizes applying the RP test with 2^k projections 
to the W* process with p = 5. 